The Markov chain model, with extensions to cover the phenomena
of arrivals and departures, was applied to a population of savings accounts
in a savings institution to forecast the size distribution, total number
of accounts and total amount of savings of the population. The stochastic
processes governing the behavior of the population were first assumed to
be time stationary. This assumption was then relaxed and an econometric
model was used to predict future values of the parameters of the
non-stationary model. Both models were validated by comparing predicted
size distributions, total number of accounts and total amount of savings
against observed values. The chi-square goodness-of-fit test was used
in the comparison. The fundamental matrix of the stationary model was
also used to predict the equilibrium distribution and related measures of
the population.
I. INTRODUCTION


A. PURPOSE

It is the purpose of this thesis to develop and evaluate two analytical 
models which can be used to forecast the structure of a population of savers 
and the level of savings of a savings institution. The population of savers 
is divided into a finite number of classes and the structure is the
distribution of savers among the classes.


B. BACKGROUND

While it is difficult, if not impossible, to predict the future behavior
of an individual, it is believed that the aggregate behavior of a population
is less erratic and, therefore, more amenable to analysis and prediction.
Assuming that a large population has considerable inertia, current trends
can be used to project into the future.

The rate of change of the structure and characteristics of a
population can, at times, be considered to be dependent upon its size,
external forces which affect the members of the population and the
response to these forces.

In the case of the population of savers in savings institutions, it
has been observed that members of this population are not very responsive
to changes in economic conditions. Thus, during periods of constant
rate of expansion or contraction in the business cycle, external forces
affecting this population may be considered to be constant and a time
stationary Markov chain model may be used to study the behavior of the
population.

However, during turning points in the business cycle or periods
of rapid economic change, external forces may be sufficiently large to
affect the savings pattern of the savers so that the stationarity assumption
may no longer hold. Under these circumstances a more comprehensive
model which takes into consideration the effects of external conditions
on the behavior of the savers would be required. The major problem in
constructing such a model would be in discovering the factors which
affect the population, measuring the effect of these factors and the effects
of interaction between various factors.

The effect of competition between various savings institutions for
a larger share of the savers' market could not be modeled because of the
lack of data. However, it is believed that, in the short run, the savers'
market is in a state of equilibrium and the share of the market captured
by a savings institution is relatively constant. Thus it can be assumed
that competition does not affect the savers' behavior to such an extent
that ignoring its effect would render a model inadequate.


C. REVIEW OF MARKOV CHAIN MODELS

The basic model used in this study was introduced by A. A. Markov
(1856-1922) around 1907. This model was first applied in economics to
the analysis of income and wage distributions by Solow [21].
The same model was used by Hart and Prais [12] in 1956 in a study of
business concentration.

The model assumes that a population of entities can be classified
into a finite number of classes. The population is observed at equidistant
time points. The number of entities observed to move from one
class to another is assumed to be generated by a stochastic process.
The probability of transition is assumed to depend only on the class the
entity is in at the current time interval, and not on where it had
previously been. This process of change can be completely described by
a transition matrix, P, as shown below. The $p_{ij}$ element is the probability
that an entity currently in the ith class will be found in the jth class
after one time period. If the stochastic process is time stationary then
the matrix does not change with time.


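P MATRIX

Rows give the class at the beginning of the period and columns the class
at the end of the period, for classes I, II, ..., M:

$$
P = \begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1m} \\
p_{21} & p_{22} & \cdots & p_{2m} \\
\vdots & \vdots &        & \vdots \\
p_{m1} & p_{m2} & \cdots & p_{mm}
\end{pmatrix}
$$
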
In most of the research studies using this model the general
procedure has been to observe some pattern of change over time and, assuming
that the stochastic process is time stationary, estimate the transition
probabilities and project the future change.

Projection of the expected number of entities in each class can be
computed as follows. Let the number of entities in each class at time t be
$n_1^t, n_2^t, \ldots, n_m^t$. If the transition probabilities are known
then the expected numbers of entities passing from the ith class to
classes 1, 2, ..., m are $n_i^t p_{i1}, n_i^t p_{i2}, \ldots, n_i^t p_{im}$.
The expected number of entities in each class at time t+1 can be
found by adding up all the entities that have moved into the class and
those that did not move out. Thus

$$
\begin{aligned}
n_1^{t+1} &= n_1^t p_{11} + n_2^t p_{21} + \cdots + n_m^t p_{m1} \\
n_2^{t+1} &= n_1^t p_{12} + n_2^t p_{22} + \cdots + n_m^t p_{m2} \\
          &\;\;\vdots \\
n_m^{t+1} &= n_1^t p_{1m} + n_2^t p_{2m} + \cdots + n_m^t p_{mm}
\end{aligned}
$$

In matrix notation the above expressions can be compactly written
as:

$$
N^{t+1} = N^t \times P
$$

where $N^{t+1} = (n_1^{t+1}, n_2^{t+1}, \ldots, n_m^{t+1})$, a 1 x m vector

      $N^t = (n_1^t, n_2^t, \ldots, n_m^t)$, a 1 x m vector

      P = matrix as defined earlier.

$N^{t+2}$ can be computed by replacing $N^t$ by $N^{t+1}$ in the above
expression. This is equivalent to multiplying $N^t$ by P x P. The distribution after
k periods can thus be obtained by multiplying $N^t$ by P raised to the kth
power.
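
The k-period projection can be illustrated with a short sketch. The three-class matrix and the starting counts below are hypothetical values chosen only for illustration; they are not data from this study, and the function name `project` is likewise an assumption of the sketch.

```python
def project(counts, P, k):
    """Return expected class counts after k periods, i.e. N^t multiplied by P k times."""
    current = list(counts)
    m = len(P)
    for _ in range(k):
        # One period: the new count in class j is sum_i n_i * p_ij.
        current = [sum(current[i] * P[i][j] for i in range(m)) for j in range(m)]
    return current


# Hypothetical transition matrix; each row sums to one.
P = [
    [0.8, 0.2, 0.0],
    [0.1, 0.7, 0.2],
    [0.0, 0.3, 0.7],
]
N0 = [100.0, 50.0, 50.0]   # hypothetical counts at time t

N1 = project(N0, P, 1)     # expected counts at time t+1
N2 = project(N0, P, 2)     # expected counts at time t+2
```

Because every row of P sums to one, the total number of entities is preserved from period to period, which is the fixed-population property of the basic model.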

This basic model has two major limitations. First, it assumes that
the total number of entities in the system is fixed. This assumption has 
been violated frequently in practical applications of this model as changes 
due to entities entering the system, leaving it or losing identity by merging 
are the rule rather than exception. Second, the assumption that the 
stochastic process is time stationary is untenable over long periods. 
Changes in numerous exogenous variables such as wage rates, technology 
and legal requirements are likely to result in changes in the transition 
probabilities. 

Adelman [1], in 1958, overcame the first limitation by introducing
the concept of a reservoir of potential entrants, from which entrants may
come and to which exits may go. There was an operational difficulty in
estimating the size of the population of potential entrants. However,
Adelman pointed out that the exact size of this population need not be
known if one was dealing with the proportion of entities in each class
rather than with the exact number of entities. She therefore assumed
the size of the reservoir to be 100,000. The reason given for this
choice was that it must be large relative to the number of entities in the
system.

Stanton and Kettunen [22] in 1967 confirmed Adelman's observation
but went on to demonstrate that: "The number of potential entrants to an 
industry or to a population has a definite and measurable effect on subse- 
quent projections made for that distribution when Markov processes are 
used." Thus, if the number of entities in each class is required, an 
arbitrary choice of the size of the population of entrants will not work. 

Duncan and Lin fof in 1972 proposed that arrivals could be treated 
as a separate stochastic process. The entry of an entity into the system 
is viewed as a two-stage process; first, arriving into the system, then 
entering into a particular class. One could then estimate the parameters 
of the entire process by observing the arrivals, the distribution of arrivals 
among the classes and the transitions between classes separately. They
denoted the data by Z, which was composed of the number of arrivals into
each class (A) and the number of transitions between each class (X). The
set of parameters of the process was denoted by $\theta = (P, p, \pi)$ where P
was the transition matrix, p was the multinomial vector of probability of
an arrival entering a particular class and $\pi$ was the vector of parameters
of the arrival distribution. The sampling distribution was then written as
follows:

$$
\begin{aligned}
f_\theta(z) = f_\theta(x, a) &= f_\theta(x \mid A = a)\, f_\theta(a) \\
                             &= f_P(x \mid A = a)\, f_{(p,\pi)}(a)
\end{aligned}
$$

The likelihood function could then be written as

$$
L(\theta) = L_{x|a}(P) \cdot L_a(p, \pi)
$$

Three reasons were given for the importance of the factorization
shown above:

"a. The first factor $L_{x|a}(P)$ depends on Z only through
the transition counts;

b. The second factor $L_a(p, \pi)$ depends on Z only on the
observed entries; and

c. Likelihood inference is reduced to two distinct and
simpler problems."
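
One practical consequence of treating transitions and entries separately is that each set of probabilities can be estimated from its own counts. The sketch below assumes the usual multinomial setting, in which the maximum likelihood estimate of a probability is the observed proportion; the count tables and function names are hypothetical and serve only as illustration.

```python
def estimate_transition_matrix(counts):
    """Estimate p_ij as x_ij divided by the row total of observed transitions."""
    P_hat = []
    for row in counts:
        total = sum(row)
        # A class with no observed accounts carries no information; keep zeros.
        P_hat.append([x / total if total > 0 else 0.0 for x in row])
    return P_hat


def estimate_entry_probabilities(entries):
    """Estimate the probability of an arrival entering each class."""
    total = sum(entries)
    return [a / total for a in entries]


# Hypothetical transition counts X (rows: from class, columns: to class).
X = [
    [90, 10, 0],
    [5, 80, 15],
    [0, 20, 80],
]
# Hypothetical numbers of arrivals A observed entering each class.
A = [60, 30, 10]

P_hat = estimate_transition_matrix(X)
p_hat = estimate_entry_probabilities(A)
```

The two estimates use disjoint parts of the data, which mirrors the statement that likelihood inference reduces to two distinct and simpler problems.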


Anderson and Goodman fo} in 1957 proposed a number of statistical 


tests for the following hypotheses:

a. that the transition probabilities of a first order chain
are constant;

b. that in case the transition probabilities are constant,
they are specified numbers;

c. that the process is a uth order Markov chain against the
alternative it is rth but not uth order.

Because of the factorization of the likelihood function Duncan and
Lin concluded that the methods of Anderson and Goodman are applicable
to a system with changing number of entities.


Hallberg [11] in 1969 challenged one of the most demanding
assumptions of the Markov chain model, that the transition probabilities are
constant regardless of time. He proposed to overcome this problem by
relating transition probabilities to economic variables and to use these
relations to predict future values of transition probabilities. For
unstated reasons he regressed transition probabilities against the logarithms
of exogenous variables. Some predicted transition probabilities did not
fall within the range of zero to one. He then suggested setting
negative predictions to zero and normalizing each row of the transition
matrix by dividing each element by the row sum.
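
The adjustment described in the preceding paragraph can be sketched as follows. The predicted row is a hypothetical example, and the function name `clip_and_normalize` is an assumption of this sketch rather than anything taken from Hallberg.

```python
def clip_and_normalize(row):
    """Set negative predictions to zero, then divide each element by the row sum."""
    clipped = [max(0.0, p) for p in row]
    total = sum(clipped)
    if total == 0:
        raise ValueError("row has no positive predictions to normalize")
    return [p / total for p in clipped]


# Hypothetical regression output for one row of the transition matrix.
predicted_row = [-0.05, 0.60, 0.55]
adjusted_row = clip_and_normalize(predicted_row)
```

After the adjustment every element lies between zero and one and the row sums to one, although the adjusted values no longer equal the regression predictions.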


D. REMARKS

Despite the limitations of the basic Markov chain model it has been
successfully used in a variety of situations. The Duncan and Lin approach
extends the basic model to include arrivals and departures. This can be
done with little additional effort. To extend the model to cover the
possibility of non-stationary transition probabilities is a considerably more
difficult task. The first problem is acquiring a data base which is large
enough to yield precise estimates of transition probabilities. The data
must also span a long period so that the factors which affect the transition
probabilities have an opportunity to vary. The second problem is to
identify these factors and to obtain a functional relationship between
them and the transition probabilities. The third problem is to predict the
future values of these factors. The accuracy of the non-stationary Markov
chain model is only as good as the prediction of these factors. The approach
suggested by Hallberg can be improved by transforming the estimates of transition
probabilities into logits (the logarithm of the estimates of odds of transition).
This will ensure that the predicted transition probabilities are between
zero and one.


The basic Markov chain model is used in this paper to model the 
behavior of a population of savers at a savings institution. The Duncan 
and Lin approach is used to treat the phenomena of entries and exits. A 
nonstationary Markov chain model (Model II of this paper) has also been 
developed. The parameters of the models were estimated with data from 
five quarters. The models were then validated with data from the following 
five quarters. The chi-square goodness-of-fit test was used to compare
the predictive power of the two models.


II. MODEL OF A POPULATION OF SAVERS


A. GENERAL 

The population of savers is first divided into m classes by the
amount of savings each saver has in his savings account. Each saver is
free to increase or decrease his savings and to leave the savings
institution by closing his account. The population is observed periodically.
A projection of the structure of the population and the amount of savings
in each class, based on these observations, is desired. A Markov chain
model can be used for this purpose provided the basic assumptions of
the model are not violated.


B. ASSUMPTIONS 
1. The probability that an account moves from class i to class j
depends only on class i and does not depend on the past history of the
account. This is obviously not true for an individual saver but possibly
holds for the population of a given class.

2. Each saver acts independently of other savers. If savers
act in unison then a Markov model will fail as the assumption of
independence is no longer valid. However, the assumption generally holds
even if savers are affected by the same factors. The transition
probabilities may shift because of these factors but the randomness in action
of individual savers is still there.

3. The distribution of the size of accounts within a class is
independent of the number of accounts that move in or out of that class.
This assumption is not required for the Markov model but is necessary if one
has to determine the amount of savings from a knowledge of the number
of accounts in each class. This assumption is generally true if the
number of accounts in each class is large relative to the net change in
each period. This assumption can be violated if the number of accounts
in each class is small and if the class boundaries are wide.

4. The transition probabilities, arrival rate and the distribution
of entrants among states are time stationary. This assumption may hold
during periods of constant expansion or contraction of the business cycle.
However, it is not expected to hold over long periods and during times
when external forces change the saving pattern of savers. This
assumption is relaxed in Model II where an attempt was made to discover their
functional relationship with economic factors and other exogenous
variables.


C. DESCRIPTION OF MODEL I

1. The Transition Matrix

Model I has only one stochastic process, the basic Markov
chain model. The number of arrivals is considered to be constant and the
proportion of arrivals entering each class is also constant.

Let m = total number of classes including one class of closed
        accounts

    t = time, measured in periods, 0, 1, 2, ..., T

    $e_t$ = the accumulated number of accounts that have closed
        at time t

    $f_t = (f_2^t, f_3^t, \ldots, f_m^t)$
        = number of accounts in each active class at t

    $c^t = (c_2^t, c_3^t, \ldots, c_m^t)$
        = number of new accounts entering each active class
          at time t

$$
P = \begin{pmatrix}
p_{11} & p_{12} & \cdots & p_{1m} \\
p_{21} & p_{22} & \cdots & p_{2m} \\
\vdots & \vdots &        & \vdots \\
p_{m1} & p_{m2} & \cdots & p_{mm}
\end{pmatrix}
$$

Let class 1 be the class which contains all the closed accounts.
It is assumed that an account in the inactive state will not re-enter the
active states. Thus $p_{11} = 1.0$ and $p_{1j} = 0$, j = 2, ..., m.

The expected number of accounts at time t can be computed from
the following relationship [9]:

$$
E(e_t, f_t) = (0, f_0)\, P_t + (0, c) \sum_{j=0}^{t-1} P_j
$$

where t = 0, 1, ..., T, $P_0 = I$, and c is the constant value of $c^t$.

The first term on the right of the equality sign is the expected
number of accounts in each class at time t from the original population
$f_0$. The second term is the expected number of accounts in each class
derived from those accounts which join the system at each period. Thus,
the accounts that arrive by period 1 would have undergone t-1 periods of
transition. Those that arrive by period 2 would have undergone t-2 periods
of transition. Those arriving at time t would undergo no transition as
$P_0 = I$.

As the stochastic process has been assumed to be time
stationary the elements of the P matrix are constant and $P_t$ is just the
single period P matrix raised to the tth power.

The expected total number of accounts in the system at
time t is just the sum of the elements of $f_t$.

If the size distribution of accounts within each class is
constant over the period of prediction, then the amount of savings in each
class can be estimated by multiplying the expected number of accounts
by the average amount of savings in that class.
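
The Model I projection can be sketched by applying one transition and then adding one period's entrants, repeated t times; unrolling this loop gives the original population multiplied by the t-step matrix plus the entrants multiplied by the sum of the lower powers. The matrix, starting counts, entrant vector and mean balances below are hypothetical values used only for illustration.

```python
def step(state, P):
    """Apply one period of transitions to the row vector (e, f_2, ..., f_m)."""
    m = len(P)
    return [sum(state[i] * P[i][j] for i in range(m)) for j in range(m)]


def project_model_one(f0, c, P, t):
    """Expected (e_t, f_t) after t periods with a constant entrant vector c."""
    state = [0.0] + list(f0)       # (0, f_0): no closed accounts counted at time 0
    entrants = [0.0] + list(c)     # (0, c): no account enters the closed class
    for _ in range(t):
        state = step(state, P)
        state = [s + a for s, a in zip(state, entrants)]
    return state


# Hypothetical matrix; class 1 (first row) holds closed accounts and is absorbing.
P = [
    [1.00, 0.00, 0.00],
    [0.10, 0.70, 0.20],
    [0.05, 0.15, 0.80],
]
f0 = [1000.0, 500.0]               # hypothetical accounts in the two active classes
c = [40.0, 10.0]                   # hypothetical entrants per period
mean_balance = [800.0, 6000.0]     # hypothetical average savings per account

state = project_model_one(f0, c, P, 4)
closed, active = state[0], state[1:]
total_active_accounts = sum(active)
savings_by_class = [n * z for n, z in zip(active, mean_balance)]
total_savings = sum(savings_by_class)
```

Closed plus active accounts always equal the original population plus all entrants, which gives a quick consistency check on any projection.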

2. The Equilibrium Distribution

If prevailing conditions were to persist the structure of the
population will reach an equilibrium in which the number of accounts
leaving each class is balanced by an equal influx of accounts from the
other classes. The limiting distribution is given by [9]:

$$
\lim_{n \to \infty} f_n = c\,(I - Q)^{-1}
$$

where c is the vector of entrants into the active classes in each period
and Q is the sub-matrix of P obtained by removing the column of
transition probabilities from the classes of active accounts (Class II to
Class XI) into the class of closed accounts (Class I), and the row of
transition probabilities of the class of closed accounts.

The matrix $(I - Q)^{-1}$ is often called the fundamental matrix,
denoted by M. The ijth element of this matrix is the expected number of
periods that a new account entering class i when it joins the system
will spend in class j before closing.

The expected number of periods that a new account entering
class i when it joins the system will remain in the system can be found
by summing the ith row of the fundamental matrix.

The above results and further treatment can be found in
Chapter 3 of Ref. [13].
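
A minimal sketch of these equilibrium quantities follows. It uses a small Gauss-Jordan routine written for the example instead of a library call, and the two-class sub-matrix Q and entrant vector c are hypothetical values, not estimates from this study.

```python
def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [[float(x) for x in A[i]] + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical transitions among two active classes (rows sum to less than one
# because some accounts close each period).
Q = [
    [0.70, 0.20],
    [0.15, 0.80],
]
c = [40.0, 10.0]   # hypothetical entrants per period into each active class

size = len(Q)
I_minus_Q = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(size)]
             for i in range(size)]
M = invert(I_minus_Q)                       # fundamental matrix
expected_stay = [sum(row) for row in M]     # periods in system, by entry class
equilibrium = [sum(c[i] * M[i][j] for i in range(size)) for j in range(size)]
```

At equilibrium, applying one more transition and adding one more period of entrants leaves the distribution unchanged, which is how the result can be checked.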

3. Prediction Interval for Single Step Transition

The predictions made with Model I are point estimates. They
do not provide any information as to how close they could be to a future
observation. A prediction interval which gives the range of values that
a future observation will take, say, ninety percent of the time would be
of greater value to a decision maker.

The number of accounts in each class is the sum of m binomial
random variables. If the number of accounts in each class is large then
the binomial random variables can be approximated by normal random
variables. The sum of normal random variables is another normal random
variable. Thus a prediction interval can be constructed using this
approximation. For a one step transition the prediction interval can be
easily constructed. However, for transitions of more than one step the
task of constructing a prediction interval becomes rather difficult. The
problem is that after the first transition the number of accounts in each
class becomes random and the expression for the unconditional variance
of the number of accounts becomes quite unmanageable. The expressions
for the variance of the number of accounts in each class, the total number
of accounts, the amount of savings in each class and the total amount
of savings for a single step transition are listed below. The derivation can
be found in Appendix A.


Let $n_j^a$ be the number of accounts in class j at beginning of
        time period a
    $p_{ij}$ be the transition probability of an account from
        class i to j
    $N^a$ be the number of accounts in the system at beginning
        of time period a
    $Z_j^a$ be the amount of savings in class j at beginning of
        time period a
    $Z^a$ be the total amount of savings in the system at
        beginning of time period a

$$
\operatorname{Var}(n_j^{a+1}) = \sum_{i=2}^{m} n_i^a\, p_{ij}\,(1 - p_{ij})
$$

$$
\operatorname{Var}(N^{a+1}) = \sum_{j=2}^{m} \operatorname{Var}(n_j^{a+1})
  + 2 \sum_{j=2}^{m-1} \sum_{\substack{k=3 \\ j<k}}^{m}
    \operatorname{Cov}(n_j^{a+1}, n_k^{a+1})
$$

$$
\operatorname{Cov}(n_j^{a+1}, n_k^{a+1}) = -\sum_{i=2}^{m} n_i^a\, p_{ij}\, p_{ik},
  \qquad j \neq k
$$

Let $z_{ij}$ be the amount of savings in an account which has moved
        from class i into class j

$$
\operatorname{Var}(Z_j^{a+1}) = \sum_{i=2}^{m} \Bigl[ n_i^a\, p_{ij} \operatorname{Var}(z_{ij})
  + E^2(z_{ij})\, n_i^a\, p_{ij}\,(1 - p_{ij}) \Bigr]
$$

$$
\operatorname{Cov}(Z_j^{a+1}, Z_k^{a+1}) = -\sum_{i=2}^{m} n_i^a\, p_{ij}\, p_{ik}\,
  E(z_{ij})\, E(z_{ik})
$$

$$
\operatorname{Var}(Z^{a+1}) = \sum_{j=2}^{m} \operatorname{Var}(Z_j^{a+1})
  + 2 \sum_{j=2}^{m-1} \sum_{\substack{k=3 \\ j<k}}^{m}
    \operatorname{Cov}(Z_j^{a+1}, Z_k^{a+1})
$$

Using these expressions the prediction intervals for a single
step transition are as follows:

90% Prediction Interval

of number of accounts in class i = $f_i \pm 1.64 \times (\operatorname{Var}(n_i))^{1/2}$

of total number of accounts = $\sum_{j=2}^{m} f_j \pm 1.64 \times (\operatorname{Var}(N))^{1/2}$

of amount of savings in class i = $S_i \pm 1.64 \times (\operatorname{Var}(Z_i))^{1/2}$

of total amount of savings = $S = \sum_{j=2}^{m} S_j \pm 1.64 \times (\operatorname{Var}(Z))^{1/2}$

where
    $f_i$ = expected number of accounts in class i = $E(n_i)$ after
        one period
    $S_i$ = expected amount of savings in class i = $E(Z_i)$ after
        one period
    N = total number of accounts in the system after one
        period (random variable)
    $n_i$ = number of accounts in class i after one period (random
        variable)
    $Z_i$ = amount of savings in class i after one period (random
        variable)
    Z = total amount of savings in the system after one period
        (random variable)
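
The interval for the number of accounts in a single class can be sketched as below, using the normal approximation with the multiplier 1.64 quoted in the text. The account counts and transition probabilities are hypothetical, and the `entrants` argument is an assumption of the sketch: a fixed number of entrants shifts the centre of the interval without changing its width.

```python
import math


def one_step_interval(n, Q, j, entrants=0.0, z=1.64):
    """Approximate 90% prediction interval for the count in active class j after one period.

    n holds the current counts in the active classes and Q[i][j] is the
    probability of moving from active class i to active class j.
    """
    mean = sum(n[i] * Q[i][j] for i in range(len(n))) + entrants
    variance = sum(n[i] * Q[i][j] * (1.0 - Q[i][j]) for i in range(len(n)))
    half_width = z * math.sqrt(variance)
    return mean - half_width, mean + half_width


# Hypothetical counts and transition probabilities for two active classes.
n_now = [1000.0, 500.0]
Q = [
    [0.70, 0.20],
    [0.15, 0.80],
]
lower, upper = one_step_interval(n_now, Q, 0)
```

With these values the expected count is 775 and the variance is 273.75, so the interval is roughly 775 plus or minus 27 accounts.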

The model can be extended to cover the case of stochastic
arrivals. Assuming the arrival process to be independent of the Markov
chain process the expression for the number of accounts is the same as
for the case of non-stochastic arrivals. The only difference is in
replacing the vector of entrants (c) by the product of the expected number
of arrivals and the multinomial vector of probability of entering each
active class. Thus,

$$
c = E(R)\,(p_2, p_3, \ldots, p_m)
$$

where R = random number of arrivals
      $p_i$, i = 2, 3, ..., m = probability of an arrival entering class i
      c = vector of entrants into the active classes

The expressions for variance are changed to take into account
the variability introduced by the additional stochastic processes.

Let $c_j^{a+1}$ be the random number of entrants into class j at time
        period a+1
    $r^{a+1}$ be the number of arrivals at time period a+1
    R be the random variable of arrivals

$$
\operatorname{Var}(c_j^{a+1} \mid r^{a+1}) = r^{a+1}\, p_j\,(1 - p_j)
$$

$$
\operatorname{Var}(c_j^{a+1}) = p_j\,(1 - p_j)\,E(R) + p_j^2 \operatorname{Var}(R)
$$
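
The unconditional variance of the entrants into class j can be obtained from the conditional one by the law of total variance. The step below is a sketch under the assumption that, given R = r arrivals, the number entering class j is binomial with parameters r and p_j; the notation $c_j$ for that number is used only for this illustration.

```latex
\begin{aligned}
\operatorname{Var}(c_j)
  &= E\bigl[\operatorname{Var}(c_j \mid R)\bigr]
   + \operatorname{Var}\bigl(E[c_j \mid R]\bigr) \\
  &= E\bigl[R\,p_j(1 - p_j)\bigr] + \operatorname{Var}(R\,p_j) \\
  &= p_j(1 - p_j)\,E(R) + p_j^{2}\operatorname{Var}(R)
\end{aligned}
```

The first term is the multinomial scatter of a typical batch of arrivals and the second is the extra spread caused by the batch size itself being random.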
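Since arrivals have been assumed to be independent of the
accounts in the system

$$
\operatorname{Var}(n_j^{a+1}) = \operatorname{Var}(c_j^{a+1})
  + \sum_{i=2}^{m} n_i^a\, p_{ij}\,(1 - p_{ij})
$$

$$
\operatorname{Var}(N^{a+1}) = \sum_{j=2}^{m} \operatorname{Var}(n_j^{a+1})
  + 2 \sum_{j=2}^{m-1} \sum_{k=3}^{m} \operatorname{Cov}(n_j^{a+1}, n_k^{a+1})
$$

$$
\operatorname{Cov}(n_j^{a+1}, n_k^{a+1}) = -\sum_{i=2}^{m} n_i^a\, p_{ij}\, p_{ik},
  \qquad j \neq k
$$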
Let EZ) be the expected amount of savings in an account in class j 
ae be the amount of savings in an account which has just entered 
class j 


atl Z x a 
Var(Z. Swe) pet =p) + n. p..Var(z, .) + 
( ( p, | P,) 2 i Pay ( kj 


Zea 
B(z, ,) n. p, - p,) 


m 
Sill separ! | a 
COWS 2, ) = 2 ae (n eee i ea 
. m -] m 
+. 2 + 
var(Z""-) = var(ze*+) +2 > Cov(zs es 
i=2 j=2 1=3 : 
D. DESCRIPTION OF MODEL II

1. The Arrival Process

It was observed that the number of new accounts opened in
each quarter was between seven hundred and one thousand. For such
large arrival rates, an assumption that the arrival rate is normally
distributed would be reasonable. However, it was felt that the arrival
distribution could be affected by external factors like state of the national
economy, seasonal effects and level of promotional or advertising activity
of the savings institution. Thus the following linear econometric model
was considered:

$$
Ar = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \cdots + \alpha_{10} X_{10} + \varepsilon
$$

where
    Ar = Number of new accounts opened in each quarter
    $X_1$ = Dummy variable for quarters of the year
    $X_2$ = California non-agricultural employment
    $X_3$ = Advertising and promotional expense of the savings
        institution
    $X_4$ = Prime commercial paper rate, 4 - 6 months
    $X_5$ = U.S. Government securities rate, 6 months
    $X_6$ = Corporation bonds rate
    $X_7$ = Wholesale price index
    $X_8$ = U.S. Government securities rate, 3 months
    $X_9$ = California personal income
    $X_{10}$ = U.S. total credit
    $\varepsilon$ = Normally distributed random variable with zero
        mean and constant variance

The linear model was selected because of its simplicity
and because of the lack of data required by more complex models.
2. The Size Distribution of New Accounts

The size distribution of new accounts may also change with
time and external conditions. To model this change, the probabilities
of new accounts entering each class were related to the same set of
exogenous variables listed in sub-section 1. Direct application of least
squares to the probabilities may yield predictions of future values that
are outside the zero to one range. To overcome this potential area of
difficulty the estimates of the probabilities were first transformed into
logits.

3. Logit Analysis

Logit analysis is a special application of econometrics to
situations in which the dependent variable has a dichotomous character.
The object is to estimate the probability of occurrence of a specified
event given a set of prevailing conditions. For application in this study
one looks for the probability that a new account enters a particular class
and the probability that an account will move from one class to another,
given a set of external conditions. Direct application of least squares
may result in the prediction of probabilities outside the zero to one
range. A monotonic transformation can overcome this difficulty. One
simple transformation is to divide the relative frequency by one minus
the relative frequency. This quantity is an estimate of the odds of the
occurrence of the event. This transformation is still restrictive as the
new variable can take on only positive values. This problem can be
overcome by taking the logarithm of this quantity. The logarithm of the
estimated odds is termed the logit of an event. The model used to predict
future values of the parameters of the entrants distribution and the
transition probabilities of the transition matrix was as follows:

$$
\log\bigl(p_i / (1 - p_i)\bigr) = a_0 + a_1 X_1 + a_2 X_2 + \cdots + a_{10} X_{10} + e
$$

$$
\log\bigl(p_{ij} / (1 - p_{ij})\bigr) = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_{10} X_{10} + e
$$


There is a further restriction that the sum of the probabilities
of the entrants distribution must equal one and the row sum of the
transition matrix should equal one too. The approach taken in this paper
was to sum up these predicted probabilities and then divide each by the
sum.
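
The logit transformation, its inverse and the final rescaling can be sketched as follows. The coefficient values and the two exogenous inputs are hypothetical; in practice the coefficients would come from the least squares fit of the logits on the exogenous variables.

```python
import math


def logit(p):
    """Logarithm of the odds p / (1 - p); defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(y):
    """Map any real number back to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-y))


def predict_probabilities(coefficients, x):
    """Predict one probability per class, then rescale so they sum to one.

    coefficients: one tuple (b0, b1, ..., bk) per class (hypothetical values here)
    x: the k exogenous variable values for the period being predicted
    """
    raw = []
    for b in coefficients:
        y = b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))
        raw.append(inverse_logit(y))
    total = sum(raw)
    return [p / total for p in raw]


# Hypothetical fitted coefficients for three classes and two exogenous variables.
coefficients = [
    (0.2, 0.5, -0.1),
    (-0.4, 0.1, 0.3),
    (-1.0, -0.2, 0.2),
]
x_future = [1.0, 2.0]
probabilities = predict_probabilities(coefficients, x_future)
```

Each predicted value lies strictly between zero and one before rescaling, so the division by the sum only enforces the sum-to-one restriction.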

4. Transition Matrix of Model II 

The transition matrix of Model II is allowed to change with
external factors; thus the t-step transition matrix is no longer the single
step matrix raised to the tth power but is the product of t matrices.
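
A short sketch of the t-step matrix as a product of period-specific matrices is given below. The two matrices are hypothetical, and the chronological left-to-right ordering is an assumption that follows from applying each matrix to a row vector of class counts.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def t_step_matrix(matrices):
    """Multiply the single-period matrices in chronological order."""
    n = len(matrices[0])
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for P in matrices:
        result = matmul(result, P)
    return result


# Hypothetical transition matrices for two consecutive periods.
P0 = [[0.9, 0.1], [0.2, 0.8]]
P1 = [[0.7, 0.3], [0.4, 0.6]]

two_step = t_step_matrix([P0, P1])
```

When all the matrices are equal the product reduces to the matrix power used in the stationary model.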

5. Predictions with Model II

To use Model II the first step would be to obtain predictions
of future values of those factors that are in the regression equations.
The parameters of the arrival process, entrance process and the transition
probabilities are then predicted. The expected number of accounts in
each class can then be computed by the following expression:

$$
E(e_t, f_t) = (0, f_0) \prod_{k=0}^{t-1} P_k
  + \sum_{j=0}^{t-1} (0, c_j) \prod_{k=j}^{t-1} P_k
$$

where
    $e_t$ = cumulative number of closed accounts
    $f_t = (f_2^t, f_3^t, \ldots, f_m^t)$ = vector of number of accounts in each
        active class at time t, t = 0, 1, ..., T
    $P_j$ = Transition matrix at time j, j = 0, 1, ..., T
    $P_k$ = Transition matrix at time k, k = 0, 1, ..., T-1
    $c_t = E(Ar^t)\,(p_2^t, p_3^t, \ldots, p_m^t)$
        = Vector of expected number of entrants in each active
          class at time t.

$$
E(N^t) = \sum_{j=2}^{m} f_j^t
$$

where
    $E(N^t)$ = Expected total number of active accounts in the system.

$$
E(Z_j^t) = f_j^t \times E(z_j)
$$

where
    $E(Z_j^t)$ = Expected total amount of savings in class j
    $E(z_j)$ = Average amount of savings in each account in class j

$$
E(Z^t) = \sum_{j=2}^{m} E(Z_j^t)
$$

where
    $E(Z^t)$ = Expected total amount of savings in the system at
        time t.
III. THE DATA AND ESTIMATION OF PARAMETERS


A. DESCRIPTION OF DATA BASE

1. General

The data used in this study was obtained from the local branch 
of a savings institution. The population of passbook accounts was selected 
for study as it has greater mobility than other types of savings accounts. 

The quarterly earnings ledgers for 1971, 1972 and the first
two quarters of 1973 were made available for this study. The quarterly
earnings ledgers contain the following information which has a bearing
on this study:

1. Identification number of each active account.
2. Amount of savings as of the last day of each quarter.
3. Amount of earnings for the quarter.
4. Summary statistic of total number of active accounts.
5. Summary statistic of total amount of savings.
6. Summary statistic of total earnings withdrawn.
7. Summary statistic of total earnings accrued.


The basic Markov chain model requires the initial distribution
of the subject population and the transition probability matrix for complete
specification. A preliminary sample of two hundred accounts showed that
seventy-two percent of the population would have balances below two
thousand dollars. A very large random sample would, therefore, be
required to pick out the behavior of large accounts. It was decided to pick
a stratified sample instead. Thus, the sample of accounts examined
consisted of three blocks of about two hundred each. The first consisted
of accounts with balances exceeding ten thousand dollars on 31 March
1971. The second block consisted of accounts between two and ten
thousand dollars and the third block consisted of accounts below two thousand
dollars. The quarterly balance of each account was recorded. To determine
the initial distribution of the population, the amounts of savings of all the
accounts with balances exceeding one thousand dollars on 31 March 1972
were recorded. The accounts were sorted by their order of magnitude and
then divided into ten classes. The class intervals were selected to
ensure that the amount of savings in each class was of the same order of
magnitude. The first eight classes uniformly spanned the interval $1 -
$15,999. The ninth class contained all accounts between $16,000 -
$19,999 and the tenth class covered the range from $20,000 - $100,000.
Accounts exceeding $100,000 were rare: there were six of them in the
31 March 1972 population. Including them in the largest class could
result in an unstable mean of the amount of savings in that class; they
were thus eliminated from the population. It is believed that these large
accounts are important in the prediction of total amount of savings and
should, therefore, be treated separately. For the purpose of this study
the amount of savings for accounts exceeding $100,000 was considered
to be unchanged over the period of observation.
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oe Arrival Rate 
The arrival rate was determined by taking the difference 
between the last identification numbers of consecutive quarters. This 
method failed to provide an accurate estimate of the arrival rate for 
Quarter IV-72. It was subsequently learned from the management that 
a block of about two hundred accounts were used to facilitate some 
financial transactions of newly arrived servicemen to Monterey. These 
accounts were subsequently closed. With this information the arrival 
rate for Quarter IV-72 was accordingly reduced. 
oe The Size Distribution of New Accounts 
The distribution of new accounts was estimated by taking a 
random sample of two hundred and fifty from the population of new accounts 
for each quarter. 
4. The Validation Sample 
To test if the models with parameters estimated from six 
hundred and twenty-two accounts could predict the behavior of the popu- 
lation, a sample comprising one-fourth of the accounts of Quarter I-73 
was taken to be used as a base for comparison. A chi square test was
performed to check whether the predicted distribution fit the observations.
5. Summary Statistics
A second check on the predictive power of the model was
made by comparing the total number of accounts and total amount of
savings predicted for Quarters II-72 to II-73 against the summary statistics
for these quantities.
6. Total Number of Accounts
It was found that the statistics for total number of active
accounts included those that had been closed. It appeared that these
accounts were purged from the records about once a year. As this
information would be used as a check on the accuracy of prediction it had
to be precise; thus, a page count of each quarter's ledger was conducted.
The information on the total number of accounts and the arrival rate is
shown in Table I.


TABLE I

TIME SERIES OF TOTAL NUMBER OF
ACCOUNTS AND ARRIVAL RATE

QUARTER    # OF NEW    TOTAL # OF    MARGINAL
           ACCOUNTS    ACCOUNTS      CHANGE

I-71       UK          16895         UK
II-71      754         17059         +164
III-71     817         17181         +122
IV-71      599         17177         -  4
I-72       778         17257         + 80
II-72      860         17354         + 97
III-72     791         17483         +129
IV-72      798         17752         +269
I-73       998         18013         +261
II-73      896         18087         + 74

Nb: UK - Unknown
7. Total Amount of Savings
The trend in the total amount of savings was studied by
fitting a least squares line through the observations. The data on total
amount of savings are contained in Table II.

[Figure: total number of savers, in thousands]

TABLE II


TIME SERIES OF TOTAL AMOUNT OF SAVINGS

QUARTER    TOTAL AMOUNT       MARGINAL       MEAN AMOUNT OF
           OF SAVINGS ($M)    CHANGE ($M)    SAVINGS ($)

I-71       36.8345            UK             2180.20
II-71      37.5140            0.6795         2199.07
III-71     38.8286            1.3146         2259.97
IV-71      39.5191            0.6905         2300.70
I-72       40.5565            1.0374         2350.15
II-72      41.5743            1.0178         2395.66
III-72     42.1492            0.5749         2410.87
IV-72      42.4047            0.2555         2388.72
I-73       44.1283            1.7236         2449.80
II-73      44.5614            0.4331         2463.73


The standard deviation of the amount of savings in each
account was estimated to be $5,314. The standard error of the mean was
estimated to be $40.54. Using the t test, any two means differing by
more than $66.86 are considered to be significantly different at the ten
percent level of significance. Thus the hypothesis that the mean was
constant over the period of observation was rejected. The average rate
of increase in the mean was found to be 1.1158 percent. This increase
could be partly accounted for by earnings accrued in the accounts. On
the average, 95.01 percent of the quarterly earnings was retained in the
institution; thus a quarterly increase of 1.219 percent in the mean could
be expected if there were no change in the structure of the population.

[Figure: mean amount of savings, in thousand dollars]
The following results were obtained by fitting the trend line
to the total amount of savings:

(1) Mean of total savings = 40.390 million dollars
(2) Standard deviation = 2.4153 million dollars
(3) Constant = a = 36.011 million dollars
(4) Coefficient = b = 0.876 million dollars per quarter
(5) Standard error of b = 0.039 million dollars per quarter
(6) Coefficient of determination = 0.986
(7) Standard error of dependent variable = 0.286 million dollars


During the period of observation the total amount of savings 
was increasing at a constant rate of 0.876 million dollars per quarter. 
The annual growth rate based on this would be 8.675%. 
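The trend line and the growth rate quoted above can be checked with a short least squares computation. The sketch below is illustrative only: the function name is hypothetical, the fit is assumed to use the nine quarters I-71 through I-73 numbered 1 to 9, and the totals for III-71 and IV-71 are inferred from the adjacent marginal changes in Table II rather than read directly.

```python
# Total savings ($M) for Quarters I-71 to I-73 (Table II). The III-71 and
# IV-71 totals are inferred from adjacent marginal changes (assumption).
savings = [36.8345, 37.5140, 38.8286, 39.5191, 40.5565,
           41.5743, 42.1492, 42.4047, 44.1283]


def fit_trend(y):
    """Ordinary least squares fit of y = a + b*t for t = 1, 2, ..., len(y)."""
    n = len(y)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    b = sxy / sxx              # slope, $M per quarter
    a = y_bar - b * t_bar      # intercept, $M
    return a, b, y_bar


a, b, y_bar = fit_trend(savings)
# Annual growth rate: four quarters of slope relative to mean total savings.
annual_growth_pct = 100.0 * 4.0 * b / y_bar
print(round(a, 3), round(b, 3), round(annual_growth_pct, 2))
```

Under these assumptions the slope agrees with the reported coefficient b to three decimal places, and four times the slope divided by the mean of total savings gives an annual rate close to the 8.675% quoted in the text.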

It was found, on the average, that 95.01% of earnings was
left in the accounts each quarter, and so the annual growth rate caused
by new accounts and increases in existing accounts, less losses due to
closing of accounts and reduction in levels of savings, would be 8.675%
- 0.9501 x 5.13% = 3.801%.
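The arithmetic of this paragraph can be reproduced in a few lines. In the sketch below the annual earnings rate of 5.13 percent is an assumption: it is the reading of the damaged source line that makes the stated result of 3.801 percent come out, and it also reproduces the quarterly figure of 1.219 percent quoted earlier.

```python
retained_share = 0.9501       # share of earnings left in the accounts (from the text)
annual_growth_pct = 8.675     # annual growth implied by the trend line (from the text)
annual_earnings_pct = 5.13    # annual earnings rate (assumed reading of the source)

# Growth in total savings attributable to retained earnings alone.
growth_from_earnings_pct = retained_share * annual_earnings_pct

# Net growth from new accounts and deposits, less closings and withdrawals.
net_growth_pct = annual_growth_pct - growth_from_earnings_pct

# Quarterly increase in the mean expected from retained earnings alone.
quarterly_from_earnings_pct = growth_from_earnings_pct / 4.0

print(round(net_growth_pct, 3), round(quarterly_from_earnings_pct, 3))
```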

[Figure: total amount of savings, in million dollars]

A second regression was performed using the marginal change
as dependent variable. The following results were obtained:

(1) Mean of first differences = 0.9117 million dollars per quarter
(2) Standard deviation = 0.4622 million dollars per quarter
(3) Constant = a = 0.823 million dollars per quarter
(4) Coefficient = b = 0.020 million dollars per quarter
(5) Standard error of b = 0.077 million dollars per quarter
(6) Coefficient of determination = 0.01
(7) Standard error of dependent variable = 0.460 million dollars per quarter

It was concluded that there was no trend in the net change
of total savings in each quarter over the period of observation.


B. ESTIMATION OF PARAMETERS
1. Model I

The arrival rate can be estimated by adding up all the new
accounts opened during the period of observation and dividing by the
number of time periods.

The distribution of new accounts can be estimated by taking
samples from each batch of new accounts, adding up the accounts entering
each class and dividing by the total number of accounts in the samples.
Thus:

\[ \hat{p}_j = \frac{\sum_{t=1}^{T} e_j(t)}{\sum_{t=1}^{T} r(t)} \]

where
\( \hat{p}_j \) = maximum likelihood estimate of the probability of a new
account entering the j-th class
\( e_j(t) \) = number of new accounts entering the j-th class at time t
\( r(t) \) = number of accounts in the sample of new accounts at time t
T = number of periods of observation

The average number of accounts entering each class can be
found by:

\[ c' = \widehat{Ar} \, (\hat{p}_1, \hat{p}_2, \ldots, \hat{p}_m) \]

where
c' = average number of new accounts entering each class at each
time period
\( \widehat{Ar} \) = maximum likelihood estimate of the arrival rate
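The two estimates above can be sketched in code as follows. The function name and the sample counts are hypothetical; only the procedure (pool the sampled entries over all periods, divide by the pooled sample size, then scale by the arrival rate) follows the description in the text.

```python
def estimate_entry_distribution(entries_by_period, arrival_rate):
    """Return (p_hat, c_prime).

    entries_by_period[t][j] is the number of sampled new accounts that
    entered class j in period t; arrival_rate is the estimated number of
    new accounts per period.
    """
    n_classes = len(entries_by_period[0])
    sample_total = sum(sum(row) for row in entries_by_period)
    # Pooled proportion of sampled new accounts entering each class.
    p_hat = [sum(row[j] for row in entries_by_period) / sample_total
             for j in range(n_classes)]
    # Average number of new accounts entering each class per period.
    c_prime = [arrival_rate * p for p in p_hat]
    return p_hat, c_prime


# Hypothetical data: two periods, three classes, 250 sampled accounts each.
entries = [[200, 40, 10],
           [190, 45, 15]]
p_hat, c_prime = estimate_entry_distribution(entries, arrival_rate=800.0)
print(p_hat, c_prime)
```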


The stationary transition probabilities can be estimated by
the following:

\[ \hat{p}_{ij} = \frac{n_{ij}}{n_i}
   = \frac{\sum_{t=1}^{T} n_{ij}(t)}{\sum_{t=1}^{T} \sum_{k=1}^{m} n_{ik}(t)}
   = \frac{\sum_{t=1}^{T} n_{ij}(t)}{\sum_{t=1}^{T} n_i(t-1)} \]

where
\( \hat{p}_{ij} \) = maximum likelihood estimate of the probability
of transition from class i to class j in any one given period
\( n_{ij} \) = total number of accounts that have moved from
class i to class j over the period of observation (0 - T)
\( n_i \) = total number of accounts that were in class i at
the beginning of each period
\( n_{ij}(t) \) = number of accounts that moved from class i to
class j during the period between t-1 and t
\( n_{ik}(t) \) = number of accounts that moved from class i to
class k during the period between t-1 and t
\( n_i(t-1) \) = total number of accounts in class i at the time
period (t-1)
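As an illustration of this estimator, the sketch below pools the transition counts of several periods and divides each pooled count by its row sum. The function name and the small count matrices are hypothetical.

```python
def pooled_transition_estimate(count_matrices):
    """Return (pooled_counts, p_hat).

    count_matrices[t][i][j] is the number of accounts that moved from
    class i to class j during period t.
    """
    m = len(count_matrices[0])
    pooled = [[sum(c[i][j] for c in count_matrices) for j in range(m)]
              for i in range(m)]
    p_hat = []
    for row in pooled:
        n_i = sum(row)  # accounts in class i at the start of the periods, pooled
        # A class that was never occupied has no estimate; keep zeros there.
        p_hat.append([n_ij / n_i if n_i else 0.0 for n_ij in row])
    return pooled, p_hat


# Hypothetical counts for two periods and two classes.
counts = [
    [[8, 2], [1, 9]],
    [[6, 4], [3, 7]],
]
pooled, p_hat = pooled_transition_estimate(counts)
print(pooled, p_hat)
```

Applying the same function to the first Z quarterly matrices only would give a cumulative estimate from a shorter observation period.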


Anderson and Goodman [2] showed that as n, the total number
of entities in the system, tends to infinity the set
\( n_i^{1/2} (\hat{p}_{ij} - p_{ij}) \) has a joint normal distribution with
means 0, variances \( p_{ij}(1 - p_{ij}) \) and covariances \( -p_{ij} p_{ik} \),
where \( j \neq k \).

This fact can be used to test if certain transition probabilities
\( p_{ij} \) have specified values \( p_{ij}^0 \) and if the transition
probabilities are indeed stationary.

2. Statistical Tests

The chi square test of goodness of fit can be used to test
hypotheses concerning transition probabilities. To test the hypothesis
that \( p_{ij} = p_{ij}^0 \), j = 1, 2, ..., m, for a given class i, the quantity

\[ \sum_{j=1}^{m} \frac{n_i (\hat{p}_{ij} - p_{ij}^0)^2}{p_{ij}^0} \]

under the null hypothesis has an asymptotic chi square distribution with
m-1 degrees of freedom. The null hypothesis is rejected if \( \hat{p}_{ij} \)
differs from \( p_{ij}^0 \) to such an extent that the above test statistic
exceeds the (1 - \( \alpha \)) percentile of the chi square distribution with
m-1 degrees of freedom, where \( \alpha \) is the level of significance.

As the variables \( n_i^{1/2} (\hat{p}_{ij} - p_{ij}) \) for different i are
independent, the summation over i is distributed as a chi square
distribution with m(m - 1) degrees of freedom.
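A minimal sketch of the test statistic summed over all classes is given below. The function name, the counts and the hypothesized probabilities are hypothetical, and every hypothesized probability is assumed to be positive.

```python
def chi_square_specified(counts, p0):
    """Return (statistic, degrees_of_freedom).

    counts[i][j] is the pooled number of transitions from class i to
    class j; p0[i][j] is the hypothesized transition probability (> 0).
    """
    stat = 0.0
    for row, p0_row in zip(counts, p0):
        n_i = sum(row)
        for n_ij, p0_ij in zip(row, p0_row):
            p_hat_ij = n_ij / n_i
            stat += n_i * (p_hat_ij - p0_ij) ** 2 / p0_ij
    m = len(counts)
    return stat, m * (m - 1)


# Hypothetical two-class example.
counts = [[70, 30], [20, 80]]
p0 = [[0.60, 0.40], [0.25, 0.75]]
stat, dof = chi_square_specified(counts, p0)
print(stat, dof)
```

The statistic would then be compared with the chosen percentile of the chi square distribution with the returned degrees of freedom.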


To test the hypothesis that the transition probabilities are
stationary over the period of observation the following test statistic can
be used:

\[ \chi^2 = \sum_{t=1}^{T} \sum_{i=1}^{m} \sum_{j=1}^{m}
   \frac{n_i(t-1) \, [\hat{p}_{ij}(t) - \hat{p}_{ij}]^2}{\hat{p}_{ij}} \]

where
\( n_i(t-1) \) = total number of entities in class i at time t-1
\( \hat{p}_{ij}(t) \) = estimate of the transition probability at time t,
obtained by counting the number of transitions from
class i to class j and dividing by \( n_i(t-1) \)
\( \hat{p}_{ij} \) = estimate of the transition probability from class i to
class j

\[ \hat{p}_{ij} = \sum_{t=1}^{T} n_{ij}(t) \Big/ \sum_{t=0}^{T-1} n_i(t) \]

The asymptotic distribution of this test statistic is chi square
with m(m-1)(T-1) degrees of freedom. The number of degrees of freedom
is reduced from m(m-1)T by m(m-1), the number of parameters estimated.
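The stationarity statistic can be sketched as below. The function name and the counts are hypothetical. Elements whose pooled estimate is zero are skipped, since every term for such an element is zero; this is a convenience of the sketch and not a rule stated in the text.

```python
def stationarity_statistic(count_matrices):
    """Return (statistic, degrees_of_freedom) for the test of stationarity.

    count_matrices[t][i][j] is the number of transitions from class i to
    class j observed in period t.
    """
    m = len(count_matrices[0])
    n_periods = len(count_matrices)
    stat = 0.0
    for i in range(m):
        n_i = sum(sum(c[i]) for c in count_matrices)
        if n_i == 0:
            continue  # class never occupied: no contribution
        for j in range(m):
            p_ij = sum(c[i][j] for c in count_matrices) / n_i  # pooled estimate
            if p_ij == 0.0:
                continue  # no such transition ever observed
            for c in count_matrices:
                n_it = sum(c[i])  # entities in class i at the start of the period
                if n_it:
                    stat += n_it * (c[i][j] / n_it - p_ij) ** 2 / p_ij
    dof = m * (m - 1) * (n_periods - 1)
    return stat, dof


# Hypothetical counts for two periods and two classes.
counts = [
    [[8, 2], [1, 9]],
    [[6, 4], [3, 7]],
]
stat, dof = stationarity_statistic(counts)
print(stat, dof)
```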





The chi square test is based on a statistic which follows a
chi square distribution when n, the total number of entities in the system,
tends to infinity. Hence it has been customary for statistics textbooks
to recommend that the smallest expected number of entities in each class
should exceed five or ten. If this requirement is not met in the original
classification then combination of neighboring classes, until the rule
is satisfied, is recommended. Cochran [4] challenged this arbitrary
rule, claiming that the power of the test is reduced by pooling classes to
conform to the rule. He found that for goodness of fit tests of bell shaped
curves such as the normal distribution there is little disturbance to the
five percent level when a single expectation is as low as 1/2. He
continued, stating that the result is also true for the one percent level if
the number of degrees of freedom exceeds six, and that two expectations
as low as one may be allowed with negligible disturbance to the five
percent level.

Using Cochran's results, classes with small expectations 
were pooled to ensure that the smallest expected number of entities in 
each class exceeded one and no more than two classes had expected 
numbers less than two. The number of degrees of freedom was reduced 
from m(m-1) (T-1) by the number of classes eliminated. 
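The pooling rule can be illustrated as follows. The text does not state how the classes to be combined were chosen, so the greedy scheme below (merge the class with the smallest expectation into its smaller adjacent neighbor until the rule holds) is an assumption made for the sake of the example, as are the function names and the data.

```python
def rule_satisfied(expected):
    """Smallest expectation exceeds one and at most two are below two."""
    return min(expected) > 1.0 and sum(e < 2.0 for e in expected) <= 2


def pool_classes(expected, observed):
    """Merge neighboring classes until the rule is satisfied."""
    exp, obs = list(expected), list(observed)
    while len(exp) > 1 and not rule_satisfied(exp):
        k = exp.index(min(exp))
        # Choose the adjacent neighbor with the smaller expectation.
        if k == 0:
            nb = 1
        elif k == len(exp) - 1:
            nb = k - 1
        else:
            nb = k - 1 if exp[k - 1] <= exp[k + 1] else k + 1
        lo, hi = min(k, nb), max(k, nb)
        exp[lo:hi + 1] = [exp[lo] + exp[hi]]
        obs[lo:hi + 1] = [obs[lo] + obs[hi]]
    return exp, obs


# Hypothetical expected and observed counts for one row of six classes.
expected = [0.4, 0.9, 12.0, 30.0, 1.5, 0.3]
observed = [1, 0, 13, 29, 2, 0]
pooled_exp, pooled_obs = pool_classes(expected, observed)
classes_eliminated = len(expected) - len(pooled_exp)
print(pooled_exp, pooled_obs, classes_eliminated)
```

The number of classes eliminated is the amount by which the degrees of freedom would be reduced for that row.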

3. Model II

The predictor for arrival rate may be obtained by applying
the method of least squares to the number of new accounts observed in
each time period and the corresponding exogenous variables.

The distribution of new accounts is estimated in each period
by dividing the number of new accounts entering each class by the total
number of accounts in the sample.

The transition probabilities \( p_{ij}(t) \) are estimated by dividing
the number of accounts that moved from class i to class j at time t by
the number of accounts in class i at time t-1.

These estimates are maximum likelihood estimates as in
Model I. They can be transformed into logits and then regressed against
the set of exogenous variables.

4. Estimation of Transition Probabilities 

Each of the six hundred and twenty-two accounts was cate- 
gorized in accordance with the classification given in Section A. 1. of 
this chapter. The number of accounts in each class for each quarter 
during the period of observation is presented in Table III. The relative 
fraction of accounts, obtained by dividing the number of accounts in each 
class by six hundred and twenty-two, is shown in Table IV. 

It can be seen that twenty-seven percent of the accounts in 
the sample were closed after ten quarters. The proportion of active 
accounts in each class was found by dividing the number of accounts in 
each class by the total number of active accounts. The results are pre- 
sented in Table V. The time series of amount of savings in each class 
is presented in Table VI. 

A chi square test was performed to test if the distribution of
active accounts had changed during the period of observation. The number
of degrees of freedom of the distribution of the chi square statistic is
eighty-one and the ninetieth percentile of the distribution is 98.01. The
chi square statistic was found to be 64.2. Thus, the null hypothesis
that the distribution did not change with time could not be rejected. This
result was rather surprising as it could imply that the probability of an
account closing did not depend on the class it was in.

Each account was examined at each quarter to determine if
it had made a transition to another class. The transitions were
accumulated in a transition count matrix. The ij-th element of this matrix
is the number of transitions from the i-th class to the j-th class in a
given quarter. An example of a transition count matrix is shown in Table VII.
The transition count matrices for other quarters are contained in Appendix B.

The estimate of each quarter's transition matrix was obtained
by the method described earlier in this section. An example, the estimate
of the transition matrix of Quarter II-71, is shown in Table VIII. The
estimates for subsequent quarters are contained in Appendix C.

A cumulative transition count matrix was formed by adding
successive transition count matrices. Thus the cumulative transition
count matrix of Quarter I-72 is the sum of the transition count matrices
of Quarters II-71, III-71, IV-71 and I-72. The cumulative transition
count matrices are contained in Appendix D.

The time stationary estimate of the transition matrix was
obtained by dividing each element of the cumulative transition count
matrix by its row sum. For the sake of brevity the estimates of the
transition matrix were termed CPM Z, where Z was a Roman numeral indicating
that the data used in the estimation came from the first Z quarters of the
period of observation. Thus CPM V stands for the estimate of the stationary
transition matrix using data from Quarter I-71 to Quarter I-72. CPM II
through CPM X are contained in Appendix E.
5. Test of Time Stationary Assumption

It can be seen from the transition count matrices that there
are a large number of elements with zero or one transition counts. The
chi square test could not, therefore, be applied directly. The classes
of each row were combined so that the smallest class had an expectation
exceeding one count and no more than two classes had expectations of
less than two counts. The following grouping was obtained:


Class I Il IW Iv V Vi VII VIII Ix ik rea 
II .046 .883 .054 - 7 - - - - - non 
III mez ome meyss 104 2015 - = = = . -O1s 
IV .040 .040 .075 .711 .089 - ee 4G 
V FoS0= ieee 672 .104 - = - = 046 
VI  _ = .081 - .147 - = 536 - ~ ~ - BE 
VII NOAA SOG 75 = = s - .760 .084 - = 026 
VITI .064 - - SEC - - On Oc = 049 
ie O75—2 - = .083 - - .083 .636 - nize 
x G7 = = Ate = “ - = = 72.20 
a nO A - - .097 - - - - - .861 


The number of degrees of freedom for the above matrix is
equal to the number of elements minus the number of linear constraints,
(47-10). As the number of matrices is nine and the number of parameters
estimated is (47-10), the number of degrees of freedom for the distribution
of the chi square statistic for the test of stationary transition probability
matrix is (47-10)(9-1) = 296.

The critical value for the 10% level of significance is 328.6.
The chi square statistic was found to be 288.7; thus the null hypothesis
that the transition probabilities were stationary could not be rejected.

6. The Initial Distribution of the Population

The initial distribution of the population was determined by 
recording all accounts with balance exceeding one thousand dollars on 
31 March 1972. The number of accounts below one thousand dollars 
was found by taking the difference between the total number of accounts 
and the number of accounts recorded. The mean and variance of the amount 
of savings in an account in each class were estimated from this sample. 
Table IX is a summary of the data obtained. 

It can be seen that the estimate of the mean of each class,
except for Classes II and XI, is close to the midpoint of the respective
class intervals. All the means are below the midpoints as there are more
accounts at the lower end of each class. The estimates of variance of
Classes II to IX are very close because the class intervals are the same
and the distribution of accounts in each class has the same general
shape. The estimates of variance for Classes X and XI show the importance
of length of class interval on predictions of total amount of savings.
The variance of the amount of savings of accounts in Classes X and XI
can be reduced by the introduction of more classes to cover the same
interval. However, this could lead to classes having smaller populations
which may not possess the Markovian property.

TABLE IX

DISTRIBUTION OF THE ENTIRE POPULATION
OF ACCOUNTS AT QUARTER I-72

CLASS    INTERVAL ($)      NUMBER OF      MEAN ($)       VARIANCE
                           ACCOUNTS

I        0                 0              0              0
II       1 - 1999          [illegible]    353            246544
III      2000 - 3999       1793           2837           310372
IV       4000 - 5999       1034           4916           317481
V        6000 - 7999       563            6855           328649
VI       8000 - 9999       366            8905           346948
VII      10000 - 11999     [illegible]    [illegible]    362291
VIII     12000 - 13999     209            12920          329649
IX       14000 - 15999     153            14961          314260
X        16000 - 19999     183            17791          1355376
XI       20000 - 99999     205            27888          110502144
         Over 100000       6              156558         2.983 x 10^[illegible]

The compromise adopted in this study was to select class intervals
such that each class had a minimum of one hundred and fifty accounts.
The six accounts that exceeded $100,000 were considered to be unchanged
during the period of observation. These accounts added up to $0.94
million. Thus the predicted amount of total savings could differ by one
million dollars because of the action of a handful of savers.

7. The Size Distribution of New Accounts

Each new account of the samples of new accounts was clas- 
sified according to the rule given in Section A. 1. of this chapter. The 
number of new accounts in each class for Quarter II-71 through Quarter 
II-73 is shown in Table XI. 

The maximum likelihood estimate of the probability of a new 
account entering each class was obtained by dividing the number of new 
accounts in each class by the total number of new accounts. The quarterly 
estimates of the probability of a new account entering each class and the 
time stationary estimates are presented in Table XII. 

A chi square test was performed to test the hypothesis that 
the probabilities were time stationary. The number of degrees of freedom 
of the distribution of the chi square statistic was seventy-two and the 
ninetieth percentile of the distribution is 87.84. The chi square statistic 
obtained was 68.8. Thus the null hypothesis could not be rejected at 
the ten percent level of significance. 

As a further check a one way analysis of variance was performed.
The results are as follows:

Total number of observations = 2250
Average of all observations = [illegible]
Standard error within groups = [illegible]
Degrees of freedom = 2241
Standard error between groups = 11488.08
Degrees of freedom = 8
F statistic = [illegible]
Level of significance = [illegible]

Thus the null hypothesis that the mean amount of savings
of new accounts is constant over the period of observation is rejected
at the 10% level of significance.

The mean and standard deviation of the amount of savings of
the samples of new accounts are as follows:


TABLE X

MEAN, STANDARD DEVIATION, MEDIAN, MAXIMUM
VALUE AND MINIMUM VALUE OF SAMPLES OF NEW ACCOUNTS

Quarter    Mean ($)       Standard       Median         Maximum        Minimum
                          Deviation                     Value          Value

II-71      [illegible]    [illegible]    [illegible]    25000          1
III-71     [illegible]    [illegible]    [illegible]    [illegible]    [illegible]
IV-71      [illegible]    [illegible]    [illegible]    50000          1
I-72       [illegible]    [illegible]    224.5          40000          [illegible]
II-72      [illegible]    8641.02        340.50         [illegible]    [illegible]
III-72     2812.04        8264.18        [illegible]    [illegible]    [illegible]
IV-72      [illegible]    7642.48        146.50         100000         [illegible]
I-73       4054.80        18161.52       [illegible]    200000         [illegible]
II-73      [illegible]    [illegible]    [illegible]    50000          [illegible]

Nb. sample size = 250

Duncan's Multiple Range Test showed that the means of
Quarters II-71, III-71, IV-71, I-72, IV-72 and II-73 are significantly
different from that of Quarter I-73 at the ten percent level of significance.
The means of Quarters II-71 and II-72 are also significantly different
at the ten percent level of significance. The differences between the
means of other quarters were not considered significant.

The means are greatly influenced by the large accounts.
The mean of Quarter I-73 would drop to $3267.87 if the $200000 account
were deleted from the sample. This reduced mean would be significantly
different from that of Quarter II-71 only.

Deleting accounts that were greater than $100000 from the
samples reduced the means of Quarters II-72, III-72, IV-72 and I-73 to
2792.10, 2421.59, 1879.04 and 2292.24 respectively. The maximum
difference between the means is 1120.76, which is considered insignificant
at the ten percent level of significance.
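The effect of a single large account on a sample mean can be verified from the figures already given: the Quarter I-73 sample of 250 accounts has a mean of $4054.80 and contains one account of $200000. The helper name below is hypothetical.

```python
def mean_without(mean, n, removed_values):
    """Mean of a sample after deleting the listed values from it."""
    total = mean * n - sum(removed_values)
    return total / (n - len(removed_values))


# Quarter I-73: sample mean 4054.80 over 250 accounts, one $200000 account.
trimmed_mean = mean_without(4054.80, 250, [200000.0])
print(round(trimmed_mean, 2))
```

This reproduces the reduced mean of $3267.87 quoted in the text.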

8. Predictors of Transition Probabilities

The corresponding estimates of transition probabilities of
each quarter were grouped together, transformed into logits and regressed
against the following set of exogenous variables:

X_1 = Dummy variable for quarters of the year
X_2 = California non-agricultural employment
X_3 = Advertising and promotional expense of the savings institution
X_4 = Prime commercial paper rate, 4-6 months
X_5 = U.S. Government securities rate, 6 months
X_6 = Corporation bonds rate
X_7 = Wholesale price index lagged by one period
X_8 = U.S. Government securities rate, 3 months
X_9 = California personal income
X_10 = U.S. total credit

The values of these variables are contained in Table XIV.

There was some difficulty in transforming the transition
probabilities as a number of them were equal to zero and the logit of
zero is minus infinity. The following rule was used to get around this
problem:

1. If there are more than two estimates for \( \hat{p}_{ij}(t) \), t = II-71,
III-71, ..., I-73, equal to zero, assume that \( p_{ij}(t) \) is
constant over the period of observation and use the time
stationary estimate obtained for Model I. No regression
will be performed for these elements.

2. If there are one or two zeros in the estimates, replace the
zeros by the time stationary estimate and proceed with logit
transformation and regression.
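The rule can be sketched in code as follows. The function name and the sample series are hypothetical, natural logarithms are assumed for the logit, and the stationary estimate is assumed to lie strictly between zero and one.

```python
import math


def prepare_logits(p_series, stationary_p, max_zeros=2):
    """Apply the zero-handling rule to one element's quarterly estimates.

    Returns None when there are more than max_zeros zero estimates
    (rule 1: treat the element as constant, no regression). Otherwise
    zeros are replaced by the stationary estimate and the logits of the
    resulting series are returned (rule 2).
    """
    zeros = sum(1 for p in p_series if p == 0.0)
    if zeros > max_zeros:
        return None
    filled = [stationary_p if p == 0.0 else p for p in p_series]
    return [math.log(p / (1.0 - p)) for p in filled]


# Hypothetical series for one element of the transition matrix.
skipped = prepare_logits([0.0, 0.0, 0.0, 0.10], stationary_p=0.05)
logits = prepare_logits([0.5, 0.0, 0.2], stationary_p=0.1)
print(skipped, logits)
```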

The number of transition probabilities removed by these rules 
was seventy-two. As there were one hundred and ten elements in the 
transition matrix that required estimation, application of these rules left 
a balance of thirty-eight elements for regression. 

The transition matrix for Quarter II-73 was not included in 
the regression in order that it could be used to test the correctness of 
the predictors obtained with data from earlier periods. Thus, there were 
eight data points in the regressions instead of nine. 

In the first regressions performed, it was found that X_8,
U.S. Government securities rate, 3 months, X_9, California personal
income, and X_10, U.S. total credit, were highly correlated with each other
and some of the other exogenous variables (R^2 about .98). To reduce the
problem of multicollinearity, these three variables were dropped from
the regression equations.

The following criteria were used to determine if the variance
of the logits of transition could be explained by the exogenous variables:

a. The F statistic obtained by the ratio of the estimate of the
variance before and after the introduction of an independent
variable must exceed 2.06, the eightieth percentile of the
F(7,6) distribution.

b. The coefficient of determination, R^2, must exceed 0.70.

Of the thirty-eight regressions only ten were found to be 
significant according to these criteria. As each row of the transition 
matrix would be divided by the sum of its elements these ten elements 
could cause significant changes to the transition matrix. 

The predictors for the ten logits of transition, obtained by 


regression, are as follows: 


“pq Ce 0.085X, - 0.300X, 
| (0.049) (0.085) 
L Peace = OrosOx. - 8.48ex 
— (0.150) ° (3.375) / 
L,, = 70.374 + 0.086X, + 0.217X, 
(0.045) (0.078) 
L = 2.627 - 0.402X 
ee (0.107) > 
Leg = ~"H.091 + 0.223%, + 6.620X, 
(0.066) (3.653) 
Ly, = ~3.001 - 0.498%, - 0.257X, + 0.361X, 
(0.171) (0.119) (0.220) 
Ly = 9.163 + 0.136X, - 0.734X, - 2.264X 
(0.020) (0.233) (1.535) 
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L omy oo + O.143X a 0.140X 1: O.130X 


lee (0.073) + (0.052) ° (0.094) > 
ic = i eooiey ~ O2387 x - O2083x - 7 Ob GX 

pe (0.080) (0.056) ° (2.275)! 
ip = ~6.469 + Cy esieh 

ah Ie (0.087) > 


These logits were then transformed back into probabilities by
taking the anti-logarithms and dividing by one plus the anti-logarithms
of the logits. Thus,

\[ \hat{p}_{ij} = \exp(L_{ij}) / (1 + \exp(L_{ij})) \]

The frequency of appearance of each exogenous variable is
as follows:

VARIABLE    FREQUENCY
1           6
2           0
3           4
4           2
5           6
6           0
7           4

The estimates of transition probabilities that were found to
vary significantly with the set of exogenous variables appeared to have
a seasonal effect, as the dummy variable appeared most frequently in the
regressions.

An increase in X_5, the U.S. Government securities rate, would
result in an increase in the probability of an account moving from
Class XI to Class X. A possible explanation is that savers in Class XI
will reduce their passbook account savings and invest in U.S. Government
securities when the securities rate increases. However, a consistent set
of explanations could not be given for the ten predictors, so a non-causal
approach had to be followed.

Nb. The number in brackets below each regression coefficient is the
standard error of the coefficient.

The transition probabilities without predictors were considered
to be stationary during the period of observation. Thus the nonstationary
matrix was formed by replacing ten elements of the estimate of the stationary
matrix with predicted values. To ensure that each row adds up to one, each
element was divided by the row sum. Selected transition matrices used
in Model II are contained in Appendix F.
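The construction of the nonstationary matrix can be sketched as below. The function name, the matrix and the predicted value are hypothetical; class indices are zero-based in the code.

```python
def nonstationary_matrix(stationary, predicted):
    """Replace selected elements with predicted values and renormalize rows.

    stationary is the time stationary estimate (list of rows); predicted
    maps (i, j) index pairs to predicted transition probabilities.
    """
    matrix = [list(row) for row in stationary]
    for (i, j), p in predicted.items():
        matrix[i][j] = p
    # Divide each element by its row sum so that every row adds up to one.
    return [[p / sum(row) for p in row] for row in matrix]


# Hypothetical two-class example with one predicted element.
stationary = [[0.7, 0.3],
              [0.2, 0.8]]
adjusted = nonstationary_matrix(stationary, {(0, 1): 0.5})
print(adjusted)
```

Rows that contain no predicted element are left unchanged by the normalization, since they already add up to one.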

A chi square test was performed to test if the predictors could
predict the transition matrix for Quarter II-73. The predicted matrix was
formed by replacing ten elements of the Quarter I-73 cumulative matrix
with values obtained with the predictors and normalizing each row. The
problem of small expected number of transitions in certain elements of the
matrix was resolved by combining classes of each row in the manner
described in Section A. 5. The ninetieth percentile for the chi square
distribution with 37 degrees of freedom is 48.84. The chi square statistic
obtained in the test was 35.25; thus, the null hypothesis that the
predicted matrix and the observed matrix of Quarter II-73 were the same
could not be rejected.

9. Predictors of Arrival Rate

The number of new accounts opened in each quarter was
regressed against the same set of exogenous variables listed in sub-section
8. The predictor of arrival rate, measured in thousands per quarter,
was found to be as follows:

Ar = [illegible] - 0.073 X_1 + 0.094 X_5
                   (0.017)     (0.029)

The standard error of each coefficient is contained in the
bracket below each coefficient. The square of the multiple correlation
between the arrival rate and the exogenous variables, X_1 and X_5, was
0.846. The standard error of Ar before and after the regression was
[illegible] and 0.045.

According to this predictor, the number of new accounts
opened per quarter decreases as the year progresses, as X_1, the dummy
variable for quarters, takes on values 1, 2, 3 and 4 for the four quarters
of the year. The number of new accounts opened would also increase as
the U.S. Government securities rate increases. No apparent reasons
could be found for this relationship. Predictions are compared with
observations in the following table.

TABLE XV

PREDICTED ARRIVAL RATE AND ACTUAL RATE OBSERVED

QUARTER    PREDICTION     OBSERVATION
II-72      [illegible]    860
III-72     [illegible]    791
IV-72      [illegible]    798
I-73       [illegible]    998
II-73      [illegible]    896
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10. Predictors of the Probabilities of a New Account Entering 
Each Class 


The estimates of the probability of a new account entering
each class obtained for Quarter II-71 through Quarter II-73 were collected
together. They were transformed into logits and regressed against the
set of exogenous variables listed in sub-section 8. Using the criteria
given in sub-section 8 to determine if the exogenous variables in a
regression could explain the variance of the logits, only four predictors
were accepted. They are:


4 = Leahy = 0.082x) ~ 0.466x, 
(0.029) (0.180) 
L. = 6.187 - 0.089%, -. 0.979%, 
(0.029) (02246) 
L, = -4,.482 - 0.184X/, + 3.053%, 
(0.049) lh073) 
Lio = -10.725 + 0. 184x, + 0.998X 
(0.094) (Ge332) 


The standard error of each coefficient is contained in the bracket below
each coefficient.

The logits are transformed back to estimates of probabilities
by:

\[ \hat{p}_j = 10^{L_j} / (1.0 + 10^{L_j}) \]

Logarithms to the base of 10 were used in both the forward
transformation and the inverse transformation. The base of the logarithm
does not affect the results of the regressions.
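The forward and inverse transformations in base 10 can be sketched as follows. The function names are hypothetical. The round trip shows that the inverse recovers the original probability as long as the same base is used in both directions.

```python
import math


def logit10(p):
    """Base-10 logit of a probability strictly between 0 and 1."""
    return math.log10(p / (1.0 - p))


def inverse_logit10(logit):
    """Inverse of logit10: 10**L / (1 + 10**L)."""
    return 10.0 ** logit / (1.0 + 10.0 ** logit)


p = 0.3
round_trip = inverse_logit10(logit10(p))
# A base-10 logit is the natural-log logit divided by ln(10).
natural_logit = math.log(p / (1.0 - p))
print(round_trip, logit10(p), natural_logit / math.log(10.0))
```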

Predictions of the number of new accounts in each class
were checked by means of the chi square test. The number of degrees
of freedom of the distribution was thirty and the ninetieth percentile of
the distribution is 40.26. The chi square statistic obtained was 36.87.
Thus, the hypothesis that the predicted distributions matched the
observations could not be rejected.

The predicted arrival distributions for Quarters II-72 to II-73
are contained in Appendix G.

IV. MODEL VALIDATION

A. VALIDATION OF MODEL I
1. Prediction of Sample Population Behavior

As there were no entries into the sample population, changes
to the structure were caused by accounts moving between classes and by
accounts closing. Thus the basic Markov chain model could be used to
model the behavior of this population.

It was decided to use the data from the five quarters, Quarter 
I-71 through Quarter I-72, to estimate the time stationary transition matrix 
and then use the matrix to predict the structure of the sample population 
for Quarter II-72 through Quarter II-73. Predictions could then be com- 
pared against observations and the chi-square test be used to determine 
the goodness of fit. 

CPM V, the estimate of the time stationary transition matrix
with the first five quarters' data, was used to predict the number of accounts
in each class and the amount of savings in each class. The results of the
predictions on the number of accounts are contained in Table XVI. The
actual number observed and the chi-square statistic for each class are
presented next to the predictions.

The predictions were expected to diverge more and more from
observations as time progressed, as errors would accumulate. The
chi-square statistic for the first prediction was 3.49 and the value for the
fifth prediction was 11.91. These correspond to the fourth percentile and
the seventieth percentile of the chi-square distribution with ten degrees
of freedom. The predicted distribution after five quarters still provided
a reasonably good fit to the observations.
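The prediction step for the closed sample population can be illustrated with a short sketch: the vector of class counts is multiplied repeatedly by the stationary transition matrix. The function name, the three-class matrix and the starting counts are hypothetical; the first class stands for closed accounts and is absorbing, and the starting total of 622 matches the sample size used in this study.

```python
def project(counts, transition, steps):
    """Return the list of class-count vectors for periods 0 .. steps."""
    history = [list(counts)]
    m = len(counts)
    for _ in range(steps):
        current = history[-1]
        # Expected number in class j next period: sum_i n_i * p_ij.
        history.append([sum(current[i] * transition[i][j] for i in range(m))
                        for j in range(m)])
    return history


# Hypothetical matrix: class 0 = closed (absorbing), classes 1 and 2 active.
P = [[1.00, 0.00, 0.00],
     [0.10, 0.80, 0.10],
     [0.05, 0.15, 0.80]]
history = project([0.0, 400.0, 222.0], P, steps=5)
print([round(x, 1) for x in history[1]])
```

Because there are no arrivals, the total of each projected vector stays equal to the starting total; only the split between closed and active classes changes.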

The predicted amount of savings in each class and the actual
amount observed are presented in Table XVII. The predictions did not
match the observations as well as the predictions of number of accounts.
The error in prediction of total amount of savings amounted to 10.6 percent
after five quarters. The difference between predicted total amount of
savings and the amount observed could be explained by the fact that the
predicted numbers of accounts for the larger classes, Class VII to Class XI,
were generally smaller than the numbers observed. The error in the number
of accounts, though relatively insignificant in absolute magnitude, when
multiplied by the average amount of savings would amount to a substantial
sum. Thus the estimates of transition probabilities between classes with
low average amount of savings per account and those with high average
amount of savings per account would have to be precise to yield more
accurate predictions of total amount of savings.

A relatively small number of large accounts can increase the
variability of total amount of savings significantly. The error in prediction
for Quarter II-73 amounted to about four hundred and fifty six thousand
dollars. Of this amount four hundred and forty two thousand dollars were
contributed by twenty two accounts in Classes VIII, IX, X and XI. It
would appear that there is no easy way to reduce the variability
in total amount of savings caused by this small group of savers.

If the time stationary assumption is not violated then it is
legitimate to estimate the transition matrix with data from the entire period 
of observation. The increase in data should yield better estimates of 
transition probabilities. Thus CPM X, the transition matrix estimated 
with all ten quarters' data, was used in predicting the number of accounts 
and the amount of savings in each class. The results are presented in 
Appendix H. 

To demonstrate the importance of data on predictions, CPM II, 
the transition matrix estimated with data from Quarter I-71 and Quarter 
II-71, was also used to predict the number of accounts and the amount of 
savings in each class. The results are also presented in Appendix H. 

The chi-square statistics obtained using CPM V, CPM II
and CPM X are compared in the following table:


TABLE XVIII


COMPARISON OF CHI SQUARE STATISTICS OBTAINED
WITH CPM V, CPM II AND CPM X


                                QUARTER
MATRIX    II-72        III-72       IV-72        I-73         II-73
CPM V     3.49         2.45         [illegible]  10.74        11.91
CPM II    7.59         13.84        [illegible]  [illegible]  [illegible]
CPM X     3.26         [illegible]  [illegible]  [illegible]  [illegible]


The tenth percentile and the ninetieth percentile of the chi
square distribution with ten degrees of freedom are as follows:

        F.10 = 4.87          F.90 = 15.99

Using F.90 as a criterion to determine if the fit is acceptable,
it could be seen that predictions with CPM V and CPM X passed the test
for the entire period of prediction whereas predictions with CPM II were
only acceptable for the first two periods.

The total amounts of savings predicted using CPM V, CPM II
and CPM X are compared in the following table:


TABLE XIX


COMPARISON OF TOTAL AMOUNT OF SAVINGS
OBTAINED WITH CPM V, CPM II AND CPM X ($M)


                                QUARTER
MATRIX    II-72        III-72       IV-72        I-73         II-73
CPM V     3.672        3.509        [illegible]  [illegible]  3.057
CPM II    3.553        3.293        3.060        2.851        2.664
CPM X     3.727        3.613        3.500        3.389        3.280
ACTUAL    3.620        [illegible]  3.509        3.404        3.418


The superiority of predictions with CPM X is apparent. The
percentage error in predicting the total amount of savings of Quarter II-73
is 4.0 percent, which is less than half of that obtained using CPM V. The
importance of accurate estimates of transition probabilities is clearly
demonstrated by the above comparisons.

2. Prediction of Behavior of Population

To predict the behavior of the entire population the model has
to include the process of arrivals and entrants. As the sample size was
small (about 3.5% of the population) it was decided to use the entire data
base to estimate the transition matrix.

The average arrival rate (number of new accounts opened per
quarter) was found to be 800.7 and the distribution of new accounts was
estimated to be as follows:


CLASS     P_i

II        0.7813
III       0.0680
IV        0.0484
V         0.0187
VI        0.0124
VII       [illegible]
VIII      0.0089
IX        [illegible]
X         0.0076
XI        0.0280


The estimates were obtained by adding up the number of new accounts in 
each class over the period of observation and dividing by the total number 
of new accounts sampled. 

The number of accounts in each class was predicted by 
adding the expected number of accounts moving into or remaining in that 
class from the population of accounts already in the system and the 
number of new accounts entering that class. The expression used in the 
computation can be found in Section C of Chapter II. 
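
The expression from Chapter II is not reproduced in this section, so the
following is only a sketch of the step as described in the paragraph above:
the expected count in each class is the flow from accounts already in the
system plus the expected number of new accounts entering that class. The
function name and the two-class numbers are hypothetical and do not come from
the study.

```python
def predict_next_quarter(counts, transition, arrival_rate, arrival_dist):
    """Expected number of accounts per active class one quarter ahead.

    counts[i]        accounts currently in active class i
    transition[i][j] probability that an account in class i is in class j
                     next quarter (a row may sum to less than 1; the
                     shortfall is the probability of closing)
    arrival_rate     expected number of new accounts per quarter
    arrival_dist[j]  probability that a new account enters class j
    """
    n = len(counts)
    return [
        sum(counts[i] * transition[i][j] for i in range(n))
        + arrival_rate * arrival_dist[j]
        for j in range(n)
    ]


# Hypothetical two-class illustration (not data from the study).
Q = [[0.80, 0.10],
     [0.05, 0.85]]
counts = [500.0, 300.0]
next_counts = predict_next_quarter(counts, Q, 100.0, [0.9, 0.1])
print(next_counts)
```

Applying the function repeatedly, feeding each result back in as `counts`,
gives the multi-quarter projection.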

The predicted total number of accounts and the total amount
of savings are shown in the following table:


TABLE XX


PREDICTED TOTAL NUMBER OF ACCOUNTS AND
TOTAL AMOUNT OF SAVINGS AND OBSERVED VALUES


QUARTER                   II-72    III-72   IV-72        I-73         II-73

TOTAL # OF       PRED.    17345    17447    17554        17664        17776
ACCOUNTS         ACT.     17354    17483    17485        17746        17820

TOTAL AMOUNT OF  PRED.    45.65    49.87    [illegible]  [illegible]  60.74
SAVINGS ($M)     ACT.     41.57    42.15    42.40        44.13        44.56


The maximum error in predicting the total number of accounts 
was 82 which was about half a percent of the total number of accounts. 
This indicated that the process of arrivals and the process of departures 
were probably as described by the model during the period of prediction. 

The failure of the model to predict the total amount of savings
could be due to the failure of the model to predict the structure of the
population or a violation of the assumption of a constant average amount
of savings in each class.

To test the hypothesis that the error in total amount of savings 
was caused by error in predicting the number of accounts in each class, a 
sample comprising one-fourth of the population at Quarter I-73 was taken 
and used to compare with the predicted structure of active accounts. The 
chi square test was used to determine the goodness of fit between the 
predicted distribution and the distribution of the sample. 

The number of degrees of freedom of the distribution of the
chi square statistic is eight and the ninetieth percentile of the distribution
is 13.36. The chi square statistic obtained was 111.0; thus, the null
hypothesis that the predicted distribution and the distribution of the sample
were the same could be rejected.

In examining the chi square statistic of each class it was
found that major sources of error came from Classes II and III, IV, V, VII
and XI (Classes II and III had been combined to ease the burden of
extracting data for the validation sample). It appeared that Classes IV, V,
VII and XI became much larger at the expense of Classes II and III. This
would account for the high predictions of total amount of savings.

Another check was made by taking the difference between the
predicted number of accounts in the sample and the actual number of
accounts in each class and multiplying by the respective average amount
of savings of each class. The errors in the amount of savings in each
class are shown in Table XXI.

If the validation sample could be taken as a good representation
of the population then the error in prediction of the population could
be estimated by multiplying the error in the amount of savings in the
validation sample by four. Thus, the prediction of total amount of savings
would be high by $11.2 million. The observed error of $13.3 million
could therefore be considered to be mainly the result of errors in predicting
the structure of the population.
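
The scaling argument can be checked with a few lines of arithmetic. The
per-class figures below are those of Table XXI; the variable names are
illustrative only. The sum of the class entries differs from the total
printed in the table by one dollar, which is consistent with rounding.

```python
# Error in amount of savings per class in the validation sample
# (Table XXI), in dollars.
class_errors = {
    "II & III": -182759,
    "IV": 589920,
    "V": 246780,
    "VI": -71240,
    "VII": 484065,
    "VIII": 155040,
    "IX": 164571,
    "X": -35582,
    "XI": 1450175,
}

SAMPLING_FRACTION = 0.25  # the validation sample was one-fourth of the population

sample_error = sum(class_errors.values())
population_error = sample_error / SAMPLING_FRACTION
print(sample_error, round(population_error / 1e6, 1))
```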

Looking at the error in the prediction of amount of savings of
each class, it can be seen that Class XI is a major contributor to the total
error. It was suspected that the model failed because of sampling errors
which resulted in estimating higher probabilities of transition between
classes with low average amount of savings and those with large average
amount of savings.


TABLE XXI


ERRORS IN PREDICTING THE AMOUNT OF SAVINGS
IN THE VALIDATION SAMPLE


CLASS       PREDICTED    ACTUAL      ERROR IN    ERROR IN
            # OF A/C     # OF A/C    # OF A/C    AMOUNT OF
                                                 SAVINGS

II & III    3435         3699        -264        -182759
IV          342          222         +120        +589920
V           180          144         + 36        +246780
VI          95           103         -  8        - 71240
VII         138          93          + 45        +484065
VIII        72           60          + 12        +155040
IX          52           41          + 11        +164571
X           48           50          -  2        - 35582
XI          122          70          + 52        +1450175

TOTAL       4484*        4482        +  2**      +2800971

*  Should equal 4482. Discrepancy caused by rounding error
** Should equal 0. Discrepancy caused by rounding error

To check out this hypothesis the following changes were made
to CPM X:

1. Accounts found to have made two or more transitions
between Classes II, III, IV and V and Classes VIII, IX, X and XI were
removed from the data base as these accounts would not be representative
of the normal behavior of the population. Eight accounts were rejected
according to this rule and CPM X was recomputed with the remaining six
hundred and fourteen accounts. This modified transition matrix was termed
MOD I.

2. The 90% lower confidence limit was estimated for transition
probabilities from Classes II, III, IV and V to higher classes. The
Poisson distribution was used to approximate the binomial distribution
in cases when the total number of transitions observed was below seven.
The normal approximation was used when the number of transitions observed
exceeded seven. This modification was applied to MOD I and termed
MOD II.
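
The exact computation behind these limits is not given in this section. The
sketch below shows one common way of forming a one-sided 90% lower limit with
the two approximations named above. It is an assumption, not the procedure
actually used in the study: the function names are hypothetical, and the text
does not say how a count of exactly seven was treated, so it is assigned to
the normal approximation here.

```python
import math

Z_ONE_SIDED_90 = 1.2816  # 90th percentile of the standard normal distribution


def poisson_upper_tail(k, mean):
    """P(X >= k) for X distributed Poisson(mean)."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


def poisson_lower_limit(k, alpha=0.10):
    """Mean at which observing k or more events has probability alpha.

    Found by bisection; the upper tail increases with the mean.
    """
    if k == 0:
        return 0.0
    low, high = 0.0, float(k)
    for _ in range(200):
        mid = (low + high) / 2.0
        if poisson_upper_tail(k, mid) < alpha:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


def lower_limit_probability(transitions, accounts):
    """Approximate 90% lower confidence limit for a transition probability."""
    if transitions < 7:
        # Few events: treat the count as Poisson and scale by the class size.
        return poisson_lower_limit(transitions) / accounts
    # Otherwise use the normal approximation to the binomial proportion.
    p = transitions / accounts
    return max(0.0, p - Z_ONE_SIDED_90 * math.sqrt(p * (1.0 - p) / accounts))


# Hypothetical counts, for illustration only.
print(lower_limit_probability(3, 400))
print(lower_limit_probability(20, 200))
```

Replacing a point estimate with its lower limit shrinks the estimated
probability of moving up to a higher class, which is the direction of
adjustment described for MOD II.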

3. Further adjustments were made to a few transition
probabilities based on the results of the chi square fit using MOD I and
MOD II. The rationale for the adjustments is as follows:

Since the data base of accounts is inadequate for estimation
of population parameters, use the additional data available from the
validation sample to correct the estimation of certain parameters.
Hypothesize that the new matrix, termed MOD III, is the best estimate and
proceed with the prediction of total number of accounts and total amount of
savings in the institution. A good fit between predicted and observed total
amount of savings over the prediction interval would give support to the
hypothesis.

MOD I, MOD II and MOD III are contained in Appendix E. 

The results obtained using the modified matrices are compared
against predictions using CPM X in Tables XXII and XXIII.

It can be seen from Table XXII that the structure of the predicted
distribution changed substantially with each modification. The
improvement in fit in the predicted distribution with each modification
had a corresponding effect on the prediction of total amount of savings.
However, the predicted total number of accounts was marginally degraded
by each modification. The changes, however, were not considered to be
significant as the percentage error was still less than one percent.

Though the modifications to the transition matrix improved the
predictions they do not prove that the true transition matrix should be as
specified by MOD III. However, with the amount of information available
the best estimate of the transition matrix is MOD III. Although its ability
to predict the structure of the population has not been put to a test, the
accurate prediction of total amount of savings encourages one to believe
that MOD III is close to the true matrix.


TABLE XXIII


MODEL I PREDICTIONS OF TOTAL NUMBER OF ACCOUNTS AND AMOUNT
OF SAVINGS ($M) USING CPM X, MOD I, MOD II AND MOD III


QUARTER               II-72        III-72       IV-72        I-73         II-73

TOTAL      CPM X      17345        17447        17554        17664        17776
# OF       MOD I      [illegible]  17428        [illegible]  17625        17720
ACCOUNTS   MOD II     [illegible]  17424        [illegible]  17609        [illegible]
           MOD III    17329        17405        17480        17552        17622
           ACTUAL     17354        17483        17485        17746        17820

TOTAL      CPM X      45.65        49.87        [illegible]  [illegible]  60.74
AMOUNT     MOD I      44.64        [illegible]  [illegible]  [illegible]  [illegible]
OF         MOD II     43.00        44.80        46.43        47.96        49.38
SAVINGS    MOD III    41.94        42.74        43.48        44.16        44.79
           ACTUAL     41.57        42.15        42.40        44.13        44.56

3. Estimates of the Fundamental Matrix

The fundamental matrix (I - Q)^-1 was estimated by substituting
Q from CPM X into the expression. It is displayed in Table XXIV.

The (i,j)th element of this matrix is the expected number of time
periods that a new account beginning in Class i will spend in Class j
before closing. Thus a new account joining, say, Class IV will on the
average visit Class V for 2.4562 periods during its entire life in the system.

The expected total time a new account which joins Class i
spends in the system is the sum of the ith row of the fundamental matrix.

The equilibrium distribution is obtained by multiplying the 
distribution of arrivals by M. The results obtained are presented in Table 
XXVI. Results obtained using MOD III are also presented. 
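
The three quantities just described (the fundamental matrix, the expected
length of stay and the equilibrium distribution) can be sketched as follows.
This is an illustration only. The two-class matrix Q and the arrival figures
are hypothetical, the helper names are not from the study, and "distribution
of arrivals" is taken here to mean the expected number of new accounts
entering each class per quarter.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fundamental_matrix(Q):
    """Return (I - Q)^-1 for the transitions among the active classes."""
    n = len(Q)
    return invert(
        [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)] for i in range(n)]
    )


# Hypothetical two-class illustration (not data from the study).
Q = [[0.80, 0.10],
     [0.05, 0.85]]
arrivals = [90.0, 10.0]  # expected new accounts per quarter, by class

N = fundamental_matrix(Q)
length_of_stay = [sum(row) for row in N]  # row sums, in quarters
equilibrium = [sum(arrivals[i] * N[i][j] for i in range(2)) for j in range(2)]
print(N)
print(length_of_stay)
print(equilibrium)
```

In this small example the equilibrium counts reproduce themselves when
carried forward one quarter with the same Q and arrivals, which is the
balance condition described in the text.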

The results are interesting in that they are predictions of the 
final state of the population if current conditions were to prevail. This 
state of equilibrium is reached when the number of new accounts opened 
per quarter balances the number of accounts closed, and the number of 
accounts moving out of each class is balanced by a corresponding number 
of accounts moving in from other classes. The fundamental matrix obtained
with CPM X predicts that the population will grow from 17251, at Quarter
I-72, to a final value of 21734. The population of each class grows
larger except for Class II. However, as noted earlier, CPM X did not
predict the total amount of savings accurately; therefore, projection of
the equilibrium distribution using it has little value except to contrast
with the results obtained with MOD III.

The fundamental matrix obtained with MOD III produced rather
believable predictions. It predicted that the total number of
accounts will grow to a maximum of 19363 and that each class grows larger
at the same time. The equilibrium amount of savings in the population
will be $53.74 million. Thus, if current conditions prevail the
institution can expect a growth of another $10 million, from the current
level of $44 million (as at 30 June 1973), in the passbook accounts.

The population under consideration, however, did not include 
accounts greater than $100,000. A separate study will therefore be required 
to predict the equilibrium number of accounts in this group of accounts 
which numbered six, at Quarter I-72. 

The expected lengths of stay of accounts in the system are
presented in the following table:

TABLE XXVII


EXPECTED LENGTH OF STAY IN THE SYSTEM
COMPUTED WITH CPM X AND MOD III


CLASS     LENGTH OF STAY IN SYSTEM (QUARTERS)
          CPM X      MOD III

II        26         23
III       29         [illegible]
IV        29         26
V         29         [illegible]
VI        29         [illegible]
VII       29         [illegible]
VIII      31         29
IX        31         29
X         32         30
XI        33         31

The expected length of stay in the system is almost constant
for all the classes except for Classes II and XI. The conclusion that can
be drawn from this observation is that the length of stay of a saver in
the system is relatively indifferent to the amount of savings he started
out with. The shorter life of accounts in Class II is a fact that has been
noticed previously. The longer life of accounts in Class XI is contrary to
expectation, as one would expect savers who do not have immediate need
for such large sums to transfer the passbook account into other types of
savings account which yield higher earnings. The observation may be
explained if these savers do not close their account when funds are
transferred to other types of accounts. The length of stay would then reflect
the length of time a saver wishes to remain a customer of the savings
institution. The fundamental matrix using CPM X predicts, on the average,
lengths of stay of 29.8 periods whereas the fundamental matrix using
MOD III predicts 27.6 periods. The smaller total number of accounts
predicted using MOD III can be explained by the fact that customers spend
less time in the system.

Thus, the model shows that efforts to keep customers in the
system are as important as attracting new customers into the system.


B. VALIDATION OF MODEL II
1. Prediction of Sample Population Behavior

The transition matrices used in predicting the behavior of the
sample were estimated by the method described in Chapter II, Section B. 8.
The elements of the transition matrices that did not have predictors were
taken from CPM V, the estimate of the time stationary transition matrix
using data from the first five quarters. The predicted matrices are contained
in Appendix F. The predicted number of accounts in each class was
compared against the actual number observed. The chi square test was used
to determine the goodness of fit between the predicted and observed
distribution of accounts in the sample.

The results are presented in Appendix I. It was found that 
the predictions matched the observations very closely for the first four 
quarters. The chi square statistic of each of the first four quarters was 
less than 6.7. However, the predictions for the fifth quarter were extremely 
poor. The chi square statistic was 25.02. If the null hypothesis that the 
predicted and observed distributions are the same were true, then this 
chi square statistic would be obtained 0.5 percent of the time. The null 
hypothesis could thus be safely rejected at the 10% level of significance. 

An investigation of the causes of the failure of the model to
predict accurately for Quarter II-73 showed that the ten predictions of
transition probabilities for Quarter II-73 had altered the transition matrix
for Quarter II-73 substantially. Two exogenous variables, X4, the prime
commercial paper rate (4-6 months), and Xe, the U.S. Government securities
rate (6 months), were considerably higher in Quarter II-73 than in the earlier
quarters. Thus the predictors were used beyond the data base from which
they were derived. This could lead to unexpected results.
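
A simple guard against this kind of extrapolation is to compare each new
value of an explanatory variable with the range seen while the predictor was
fitted. The sketch below is illustrative only; the function name and the
interest rate figures are hypothetical and are not taken from the study.

```python
def outside_fitted_range(fitted_values, new_value):
    """True when new_value lies outside the range seen while fitting."""
    return not (min(fitted_values) <= new_value <= max(fitted_values))


# Hypothetical rates (percent) for the quarters used to fit a predictor.
rates_used_in_fit = [5.1, 4.9, 4.6, 4.7, 5.0, 5.3, 5.6, 6.0]

beyond = outside_fitted_range(rates_used_in_fit, 7.4)  # above every fitted value
within = outside_fitted_range(rates_used_in_fit, 5.2)  # inside the fitted range
print(beyond, within)
```

A predictor flagged in this way could be replaced by the corresponding time
stationary estimate for that quarter, which is close in spirit to the
modified matrix tested next.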


To verify the hypothesis that Model II failed in Quarter II-73
because of the use of some predictors beyond the data base from which they
were derived, predictions were repeated using a matrix with predictors that
had Xe as an explanatory variable removed. The chi square statistic obtained
with this modified matrix was 14.87, a substantial improvement from that
obtained without the modification. The ninetieth percentile of the chi
square distribution with ten degrees of freedom is 15.99. Thus the null
hypothesis could not be rejected at the 10% level of significance. It was
therefore concluded that the hypothesis on the failure of the model is
correct.

2. Prediction of Population Behavior 

The complete Model II was used in the prediction of the behavior 
of the population. The predicted number of new accounts opened in each 
Quarter was computed in Chapter III, Section B. 9. The predicted number 
of new accounts entering each class was presented in Chapter III, Section 
B. 10. The transition matrix used was the same as that used in the
prediction of sample population behavior in sub-section 1.

With experience gained in earlier predictions with Model I, 
high predicted total amount of savings was expected. The modifications 
applied to the transition matrix of Model I were also applied to Model II. 
The predicted total number of accounts and total amount of savings are 
presented in Table XXVIII. 

The total number of accounts predicted by Model II matched
the observed values closely for Quarters II-72, III-72 and IV-72, but
diverged quite widely by Quarter II-73. The predicted total amount of
savings was high but the divergence increased substantially in Quarter
II-73.


TABLE XXVIII


PREDICTED TOTAL NUMBER OF ACCOUNTS AND TOTAL AMOUNT OF SAVINGS
($M) BY MODEL II, USING CPM X, MOD I, MOD II AND MOD III


QUARTER                II-72    III-72   IV-72    I-73     II-73

TOTAL       CPM X      17307    17380    17448    17985    18547
NUMBER OF   MOD I      17305    17374    17438    17973    18534
ACCOUNTS    MOD II     17304    17370    17430    17966    18531
            MOD III    17304    17364    17414    17953    18526
            ACTUAL     17354    17483    17485    17746    17820

TOTAL       CPM X      45.61    49.79    53.68    58.20    62.49
AMOUNT      MOD I      44.59    47.88    50.98    54.80    58.47
(MILLION    MOD II     42.95    44.69    46.32    48.75    51.13
DOLLARS)    MOD III    41.90    42.64    43.31    44.80    46.19
            ACTUAL     41.57    42.15    42.40    44.13    44.56


The hypothesis, that the model failed to yield accurate
predictions because the predictors of transition probabilities were used
beyond the range of data used to obtain the predictors, was put to another
test by predicting with a transition matrix that had predictors with Xe as
explanatory variable removed. The predictions are presented in Table XXIX.

It can be seen that the predicted total number of accounts has 
improved considerably by this change to the transition matrices. The 
improvement to predictions of total amount of savings is not so pronounced. 

The validation sample of 4483 accounts taken from the Quarter
I-73 population was used to check if Model II predicted the population
structure accurately. The predictions obtained with CPM X, MOD I, MOD
II and MOD III are presented in Table XXX. Predictions by Model II' are
presented in Table XXXI.


TABLE XXIX


PREDICTED TOTAL NUMBER OF ACCOUNTS AND TOTAL
AMOUNT OF SAVINGS BY MODEL II'


QUARTER                II-72    III-72   IV-72    I-73     II-73

TOTAL       CPM X      17320    17373    17404    17738    18068
NUMBER OF   MOD I      17310    17354    17375    17697    18016
ACCOUNTS    MOD II     17310    17350    17365    17681    17992
            MOD III    17309    17343    17346    17645    17937
            ACTUAL     17354    17483    17485    17746    17820

TOTAL       CPM X      45.62    49.69    53.34    57.44    61.23
AMOUNT      MOD I      44.60    47.76    50.62    54.01    57.14
(MILLION    MOD II     42.96    44.58    46.00    48.04    49.92
DOLLARS)    MOD III    41.90    42.55    43.06    44.28    45.37
            ACTUAL     41.57    42.15    42.40    44.13    44.56


It can be seen that the predicted distribution improved with
each modification. The error in predicting the total amount of savings
can be attributed to the error in the prediction of number of accounts in
each class. As an example, the error in predicting the number of accounts
in Classes XI, VII and IV could account for $2.9 million in the prediction
of total amount of savings for Quarter I-73 using MOD II.

Though the predicted distribution using MOD III fitted the
observed distribution very closely, the error in predicting the number of
accounts in Class XI could account for $0.67 million of the error in
predicting the total amount of savings for the entire population. This again
demonstrates the importance of accurate predictions of number of accounts
in classes with large average amount of savings.


C. COMPARISON OF MODEL I AND MODEL II
1. Sample Population Behavior

The chi square statistics obtained in the test of goodness of
fit between the predicted distributions and the observed distribution were
used as a measure of the predictive power of the two models.

Model II' denotes Model II modified by the deletion of five
predictors of transition probabilities which had Xe as an explanatory
variable.

The chi square statistics obtained with Model I, Model II and
Model II' are presented in Table XXXII.


TABLE XXXII


COMPARISON OF CHI SQUARE STATISTICS
OBTAINED WITH MODELS I, II AND II'


                                      QUARTER
CPM   MODEL   II-72        III-72       IV-72        I-73         II-73
V     I       3.49         2.45         [illegible]  10.74        11.91
V     II      3.60         [illegible]  [illegible]  4.35         25.02
V     II'     [illegible]  [illegible]  8.69         8.84         14.87
II    I       7.59         13.84        [illegible]  [illegible]  [illegible]
II    II      6.76         11.46        24.65        [illegible]  76.64
II    II'     6.99         [illegible]  [illegible]  43.05        60.39
X     I       3.26         [illegible]  [illegible]  [illegible]  [illegible]
X     II      [illegible]  0.94         2.82         [illegible]  [illegible]
X     II'     [illegible]  [illegible]  4.71         [illegible]  [illegible]


Except for Quarter II-73, Model II was generally superior to
Model I. Model II' improved the predictions for Quarter II-73 but did
not perform as well as Model II for the other quarters. The results were
expected as Model II, having greater flexibility, should perform better
under normal situations. Model II', with only five predicted elements in
its transition matrix, would be expected to be less responsive to changes
in external conditions, and thus would not perform as well as Model II.
Model I, being completely indifferent to external conditions, should be
expected to be the poorest performer among the three models.

The total amounts of savings predicted by Models I,
II and II' are presented in Table XXXIII.


TABLE XXXIII


COMPARISON OF PREDICTED TOTAL AMOUNT OF
SAVINGS ($M) BY MODELS I, II AND II'


CPM   MODEL   II-72        III-72       IV-72        I-73         II-73
V     I       3.672        3.509        [illegible]  [illegible]  3.057
V     II      3.674        3.507        3.346        [illegible]  3.043
V     II'     3.669        3.497        [illegible]  3.180        3.024
II    I       3.553        3.293        3.060        2.851        2.664
II    II      3.574        3.329        [illegible]  2.937        2.764
II    II'     3.567        3.315        3.092        2.902        [illegible]
X     I       3.727        3.613        3.500        3.389        3.280
X     II      3.729        3.609        3.488        3.373        3.234
X     II'     3.726        3.603        3.479        3.367        [illegible]

ACTUAL        3.620        [illegible]  3.509        3.404        3.418


The predictions of the three models were quite close.
In view of the variability of the predictions of total amount of savings it
was not possible to state which of the three models performed better.

2. Behavior of Entire Population

Both models predicted total number of accounts very closely
for the first three quarters. The performance of Model II deteriorated
badly in the fifth quarter, Quarter II-73. The failure of Model II in
Quarter II-73 was attributed to the failure of the predictors of transition
probabilities to predict beyond the data base from which they were derived.
Predictions made with a matrix modified by the removal of predictors which
had Xe as an explanatory variable were closer to the actual value for
Quarters I-73 and II-73 than predictions by Model II. Table XXXIV compares
the total number of accounts predicted by Model I, Model II and Model II',
Model II modified as described above.


TABLE XXXIV


TOTAL NUMBER OF ACCOUNTS PREDICTED BY
MODEL I, MODEL II AND MODEL II' USING MOD III


            MODEL     II-72    III-72   IV-72    I-73     II-73

TOTAL       I         17329    17405    17480    17552    17622
NUMBER OF   II        17304    17364    17414    17953    18526
ACCOUNTS    II'       17309    17343    17346    17645    17937
            ACTUAL    17354    17483    17485    17746    17820


It can be seen that Model I predictions are closer to the 
observed values for the first three quarters. However, unlike Models 
II and II', Model I could not predict the sudden increase in the number 
of accounts in Quarter I-73. This, again, shows that Model I is appli- 
cable only when external conditions remain constant. 

Both models were equally bad in predicting the total amount
of savings. The cause for the failure was attributed to sampling errors.
Similar modifications were made to the transition matrix of both models.
The improvement finally achieved was substantial as can be seen in the
following table:


TABLE XXXV


COMPARISON OF TOTAL AMOUNT OF SAVINGS PREDICTED
BY MODELS I, II AND II' FOR QUARTER II-73


           MODEL   CPM X    MOD I        MOD II   MOD III  ACTUAL
TOTAL      I       60.74    [illegible]  49.38    44.79    44.56
SAVINGS    II      62.49    58.47        51.13    46.19    44.56
           II'     61.23    57.14        49.92    45.37    44.56


Predictions using CPM X, MOD I and MOD II are so different
from the observations that the difference between Model I and Model II'
predictions is considered insignificant. In the case of predictions made
using MOD III, the errors between prediction and observation are too small
to discriminate between Model I and Model II' using just one point. Thus,
Table XXXVI, comparing the predictions of the three models using MOD III
over the entire period of prediction, is presented below.


TABLE XXXVI


COMPARISON OF TOTAL AMOUNT OF SAVINGS PREDICTED
BY MODELS I, II AND II' USING MOD III


           MODEL     II-72    III-72   IV-72    I-73     II-73
TOTAL      I         41.94    42.74    43.48    44.16    44.79
SAVINGS    II        41.90    42.64    43.31    44.80    46.19
           II'       41.90    42.55    43.06    44.28    45.37
           ACTUAL    41.57    42.15    42.40    44.13    44.56


The predictive power of each model in predicting the size
distribution of the population could not be compared as the validation sample
was also used in estimating the parameters of MOD III. Thus, another
sample would have to be taken to validate this capability of the two
models. It is regrettable that this step could not be carried out at the
time of the writing of this report because of lack of time. It is therefore
proposed that the models be validated again at a later date.


V. SUMMARY AND CONCLUSIONS


A. SUMMARY 

The purpose of this research has been to develop a model that can
be used to study the structure of a population of savings accounts in a
savings institution and to predict future levels of savings in the
institution.

Two stochastic models were developed and evaluated in this study. 
The first model was based on the time stationary Markov chain model 
extended to cover the phenomena of opening and closing of accounts. 
The population was divided into ten classes and the continuous distribu- 
tion of amount of savings of each account was idealized by a discrete 
distribution with ten classes. The classes were numbered from two to 
eleven. The class intervals of Classes II to IX were $2,000. Class X
contained all accounts with balances between $16,000 and $19,999 and 
Class XI contained all accounts with balances between $20,000 and 
$100,000. Class I was used as a reservoir for all the accounts that had 
closed. The parameters of Model I were assumed to be constant over the 
period of observation and prediction. 

The second model was based on the nonstationary Markov chain 
model. The parameters were not assumed to be constant. An econometric 
model was used to relate the estimates of the parameters to a set of 
exogenous variables. Predictors of the parameters, if found to be signi- 


ficant, were used to predict future values of the parameters. 

By assuming that the mean amount of savings of accounts in
each class remains constant with time, the total amount of savings in each
class could be computed by multiplying the number of accounts in each
class by the mean.
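
As a small worked illustration of this computation, and of why a handful of
accounts in the largest class matters so much, consider the sketch below. The
class counts and mean balances are hypothetical and were chosen only to make
the arithmetic easy to follow.

```python
def total_savings(accounts_per_class, mean_balance_per_class):
    """Total savings as the sum over classes of count times class mean."""
    return sum(n * m for n, m in zip(accounts_per_class, mean_balance_per_class))


# Hypothetical three-class population (not data from the study).
mean_balance = [600.0, 7000.0, 30000.0]
base = total_savings([3000, 150, 20], mean_balance)

# Misjudging the largest class by only five accounts.
shifted = total_savings([3000, 150, 25], mean_balance)
print(base, shifted - base)
```

Here an error of five accounts out of more than three thousand moves the
total by over four percent, because the error falls in the class with the
largest mean balance.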

The parameters of the two models were estimated with data obtained 
from the local branch of a savings institution. The level of savings of a
stratified sample of 622 accounts was observed over a period of ten
quarters, Quarter I-71 to Quarter II-73. Movements of accounts between
classes were recorded as transitions between the respective classes. The
transition probability matrix was estimated by dividing the number of 
transitions from each class by the total number of accounts in the class 
at the beginning of the quarter. 
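
The estimation rule just described can be sketched as follows. This is a
minimal illustration under stated assumptions: transitions from all quarters
are pooled into one matrix (as for a time stationary estimate), classes are
coded 0 to n-1 with 0 standing for closed accounts, and the function name and
sample histories are hypothetical.

```python
def estimate_transition_matrix(histories, n_classes):
    """Estimate transition probabilities from per-account class histories.

    histories  list of sequences, one per account, giving the class
               occupied at the start of each successive quarter
    n_classes  number of classes, including class 0 for closed accounts
    """
    counts = [[0] * n_classes for _ in range(n_classes)]
    for history in histories:
        # Each consecutive pair of quarters is one observed transition.
        for start, end in zip(history, history[1:]):
            counts[start][end] += 1
    matrix = []
    for row in counts:
        total = sum(row)  # accounts observed starting a quarter in this class
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical histories for four accounts over four quarters.
histories = [
    [1, 1, 2, 2],
    [1, 2, 2, 0],
    [2, 2, 1, 1],
    [1, 1, 1, 0],
]
P = estimate_transition_matrix(histories, 3)
print(P[1])
print(P[2])
```

Rows with no observed starting accounts are left as zeros here; with a sample
as small as some of the upper classes in the study, such sparsely observed
rows are where sampling error is most likely to enter.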

The total number of new accounts opened in each quarter of the 
period of observation was used to estimate the arrival rate or expected 
number of new accounts per quarter. 

Two hundred and fifty new accounts were randomly selected each 
quarter. These were used to determine if the size distribution of new
accounts had changed during the period of observation. These accounts
were classified into the ten classes described earlier and the probability
of a new account being in each class was estimated. These estimates were
transformed into logits and regressed against a set of exogenous variables.
The regressions that were considered significant were used as predictors
for future values of the probability of a new account entering a particular
class.
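
The logit approach can be illustrated with a short sketch. It is a simplification under stated assumptions: a single hypothetical exogenous variable (labelled here as an interest rate) stands in for the set of variables used in the study, the probabilities are invented, and ordinary least squares with one regressor is used.

```python
import math


def logit(p):
    """Map a probability in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Map a real number back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: one exogenous variable observed over five quarters and
# the estimated probability that a new account enters a given class.
rates = [4.0, 4.5, 5.0, 5.5, 6.0]
p_hat = [0.10, 0.12, 0.15, 0.17, 0.21]

intercept, slope = fit_line(rates, [logit(p) for p in p_hat])

# Predict the probability for an assumed future value of the variable.
p_next = inverse_logit(intercept + slope * 6.5)
print(round(p_next, 4))
```

Working on the logit scale guarantees that a predicted value, once transformed back, lies between zero and one. It does not by itself guarantee that the predicted probabilities of all classes sum to one.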

The structure of the population of savings accounts for Quarter I-72
was determined and used as the initial distribution in predictions of the
behavior of the population.

The chi square test was used to determine if the transition matrix 
had changed during the period of observation and if the predicted size 
distributions matched the observed distributions. 

The parameters of Model I were estimated using data from the first 
five quarters. The model was then used to predict the size distribution
of accounts of the sample and the amount of savings in the sample
population.

The size distribution of the population of savings accounts was 
predicted using the distribution of the population at Quarter I-72 as the 
initial distribution. Total number of accounts and total amount of savings 
were also predicted. 

Most of the parameters of Model II were estimated using data from 
the first five quarters. Of 110 transition probabilities 10 were found to 
vary significantly with the set of exogenous variables. Thus the transition 
matrix of Model II contained only ten predicted elements. The predictors 
were determined using data from the first eight quarters. 

Model II was used to predict the size distribution of accounts in 
the sample and the amount of savings in the sample. It was then used
to predict the behavior of the population.

A sample comprising one fourth of the population of Quarter I-73
was used to test whether the size distributions predicted by both models
fitted the observed distribution. Predicted total number of accounts and
total amount of savings were also tested by comparison with actual values
observed over the prediction horizon.


B. CONCLUSIONS 
1. Model I

The hypothesis that the stochastic processes were stationary 
during the period of observation could not be rejected at the ten percent 
level of significance. Thus the assumption of stationarity could be con- 
sidered to hold. 

The predicted size distribution of the sample matched the 
observed distribution closely. The largest chi square statistic obtained 
was 11.91. This corresponded to the seventieth percentile of the chi 
square distribution with ten degrees of freedom. It was concluded that 
the sample of 622 accounts behaved as described by the Markov chain 
model. 
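
The goodness of fit comparison can be sketched as follows. The observed and expected counts are hypothetical (eleven classes summing to 622 accounts). The percentile function uses the closed form of the chi square distribution function that holds for an even number of degrees of freedom, which covers the ten degrees of freedom quoted in the text.

```python
import math


def chi_square_statistic(observed, expected):
    """Pearson chi square statistic for observed versus expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_cdf_even_df(x, df):
    """Chi square distribution function, valid for even df only.

    For df = 2k: F(x) = 1 - exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return 1.0 - math.exp(-half) * total


# Hypothetical counts over eleven classes (ten degrees of freedom).
observed = [30, 150, 120, 90, 70, 50, 40, 30, 20, 12, 10]
expected = [28.0, 155.0, 118.0, 92.0, 66.0, 53.0, 41.0, 28.0, 21.0, 11.0, 9.0]
statistic = chi_square_statistic(observed, expected)

# Percentile of the largest statistic reported for Model I in the text.
percentile_model1 = chi_square_cdf_even_df(11.91, 10)
print(round(statistic, 3), round(percentile_model1, 3))
```

A statistic near the seventieth percentile, as reported for Model I, is well inside the range expected when the model is adequate.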

The predicted total amount of savings differed from the actual 
amount by a maximum of ten percent. It was concluded that Model I could 
predict total amount of savings but the variability in the prediction could 
be rather large as a small number of savers with large accounts could 
cause large fluctuations in the total amount of savings. 

Model I failed to predict the behavior of the population. The
failure was attributed to errors in estimation of parameters of the
transition matrix. This observation was supported by the fact that
predictions were substantially improved by changing the values of some
transition probabilities. The additional data in the validation sample
were used to adjust the estimates of a few transition probabilities.
Predictions of total amount of savings made with this modified matrix
were greatly improved. The maximum error was found to be half a percent.
A good fit between predicted and actual total amount of savings is not by
itself sufficient to indicate that the model has predicted the size
distribution of the population correctly. However, as the predicted size
distribution of the population of Quarter I-73 has been made to fit the
observed distribution, and if the structure of the population did not
change drastically over the period of observation, then it is plausible
that the true transition matrix is not very different from the modified
matrix. It is regrettable that time did not permit the drawing of further
samples to validate the model so that a firmer conclusion could be reached.

The fundamental matrix, obtained from the 'best' estimate of 
the transition matrix, predicted that the maximum total number of accounts 
in the institution will be 19,363, and the maximum total amount of savings
contributed by accounts below $100,000 will be $53.74 million, if the
conditions existing during the period of the data were to persist. 

The average time an account remains open was predicted
to be 27.6 quarters (6.9 years). The expected length of stay of an account
in the system appeared to be independent of the amount of savings in
the account when it first joined the system, except if the amount was
less than $2,000 or more than $20,000. It was concluded that a saver's
desire to remain a customer of the institution did not depend on his initial
deposit.
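
The role of the fundamental matrix in these predictions can be illustrated with a small sketch. It assumes the standard absorbing-chain definition N = (I - Q)^(-1), where Q holds the transition probabilities among the open classes only. The row sums of N are then the expected number of quarters an account stays open, and a constant vector of arrivals per quarter multiplied by N gives the equilibrium number of accounts in each class. The two-class matrix and arrival figures are hypothetical.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fundamental_matrix(Q):
    """Return (I - Q)^(-1) for the transient (open-class) block Q."""
    n = len(Q)
    i_minus_q = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)]
                 for i in range(n)]
    return invert(i_minus_q)


# Hypothetical transitions among two open classes; the remaining
# probability in each row is the chance of closing during the quarter.
Q = [[0.90, 0.05],
     [0.04, 0.92]]
arrivals = [100.0, 20.0]  # hypothetical new accounts per quarter, by class

N = fundamental_matrix(Q)
expected_stay = [sum(row) for row in N]  # quarters open, by entry class
equilibrium = [sum(arrivals[i] * N[i][j] for i in range(len(Q)))
               for j in range(len(Q))]

print([round(t, 2) for t in expected_stay])
print([round(c, 1) for c in equilibrium], round(sum(equilibrium), 1))
```

Multiplying the equilibrium counts by the class means would give the corresponding equilibrium amount of savings.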

A small increase in the expected length of stay of an account
in the system could have a large effect on the total amount of savings.
Thus efforts to keep customers contented, so that they remain longer in
the system, are important.

2. Model II

The predicted size distributions of the sample were very close
to the observed distributions for the first four periods. The maximum chi
square statistic was 6.7, which is less than the thirtieth percentile of
the chi square distribution with ten degrees of freedom. The chi square
statistic for the fifth quarter, Quarter II-73, rose sharply to 25.02. An
investigation showed that the model failed because five of the predictors
of transition probability were used beyond the range of the data from
which they were derived, thus giving erroneous predictions for Quarter
II-73. It was therefore concluded that Model II could predict accurately
provided the predictors are not required to predict beyond the range of
the data from which they were derived.

The maximum error in predicting the total amount of savings
was about ten percent. The predictions were very close to those made
by Model I.

Model II fared no better than Model I in the prediction of
population behavior, for the same reasons as stated earlier.







3. Discussion

Both models performed credibly in predicting the behavior of 
the sample of 622 accounts. This is encouraging as it leads one to con- 
clude that a population of savers does possess the Markovian property. 

Failure of the models to predict the behavior of the entire
population correctly was attributed to errors in the estimation of
parameters. This explanation is plausible, as modifications to the
transition matrix, using additional data from the validation sample, gave
predictions of total amount of savings that were accurate to half a
percent. As it is difficult to conceive how a random sample could exhibit
Markovian behavior without the population possessing that characteristic,
one is further led to believe the above explanation.

If external conditions do not have much influence on the behavior
of the population of savers then Model I, because of its simplicity,
is the ideal model to use. Model I could still be used if the rate of
change of the population behavior is slow. Transition probabilities
could be estimated each quarter and exponential smoothing used to adjust
the past estimates with the additional information. However, this model
does not allow the use of additional information regarding the operating
environment to improve the predictions.
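
The quarterly updating suggested above might be sketched as follows. The smoothing constant, the function name and both matrices are hypothetical; the study does not specify them.

```python
def smooth_transition_matrix(previous, latest, alpha):
    """Exponentially smooth a transition matrix, element by element.

    previous: the smoothed estimate carried from earlier quarters.
    latest:   the estimate from the most recent quarter alone.
    alpha:    weight given to the latest quarter, between 0 and 1.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return [[alpha * new + (1.0 - alpha) * old
             for old, new in zip(prev_row, new_row)]
            for prev_row, new_row in zip(previous, latest)]


# Hypothetical two-class estimates.
previous_estimate = [[0.90, 0.10],
                     [0.20, 0.80]]
latest_estimate = [[0.80, 0.20],
                   [0.30, 0.70]]
smoothed = smooth_transition_matrix(previous_estimate, latest_estimate, 0.3)
for row in smoothed:
    print([round(p, 3) for p in row])
```

Because each smoothed row is a weighted average of two rows that sum to one, the result remains a valid transition matrix without renormalization.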

Model II has not been given an opportunity to demonstrate
its capability because of the limited data base. It has the advantage
of improvement with additional knowledge of the operating environment.
However, its main limitation is in the requirement of predictions of
values of exogenous variables to predict future values of the parameters
of the model. Thus, predictions of Model II are only as good as the
predictions of exogenous variables. The success of the model, therefore,
depends to a great extent on the judgement of the forecaster.

4. Areas for Further Research 

The Markovian property of a population is an important popu- 
lation characteristic. The results observed in the application of the models 
to the sample should be verified using a larger number of accounts, pref- 
erably the entire population. A computerized bookkeeping system should 
be able to take on the additional task of counting the number of transitions 
between classes without much additional effort. 

The variability of predictions in total amount of savings 
could be reduced if the movement of large accounts could be predicted. 
Accounts with a balance exceeding $100,000 could be the subject of 
another study. 

The present study did not deal with the interaction between 
various types of accounts in a savings association. Movement of accounts 
between different types of accounts has an impact on the total amount of 
savings in the institution. This area merits further research especially 
if management desires to know the future level of savings of the whole 
institution. 

The variance of the predictions for more than one period is
difficult to derive, as the elements of the multi-period transition matrix
are sums of products of random variables that are approximately normal
when the sample size is large. An alternative approach would be to use the
Monte Carlo method to obtain an estimate of the variance.
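
One possible form of such a Monte Carlo experiment is sketched here. It is an illustration under simplifying assumptions: the transition matrix is treated as known, arrivals are ignored, and every account moves independently according to its row of the matrix. The matrix, initial counts and number of replications are hypothetical.

```python
import random


def step(counts, P, rng):
    """Move every account one period ahead according to its row of P."""
    new_counts = [0] * len(counts)
    for i, n in enumerate(counts):
        for _ in range(n):
            u = rng.random()
            cumulative = 0.0
            for j, p in enumerate(P[i]):
                cumulative += p
                if u < cumulative:
                    new_counts[j] += 1
                    break
            else:
                # Guard against rounding error in the row sum.
                new_counts[-1] += 1
    return new_counts


def simulate(initial, P, periods, replications, seed=0):
    """Return the sample mean and variance of each class count."""
    rng = random.Random(seed)
    k = len(initial)
    sums = [0.0] * k
    squares = [0.0] * k
    for _ in range(replications):
        counts = list(initial)
        for _ in range(periods):
            counts = step(counts, P, rng)
        for j, c in enumerate(counts):
            sums[j] += c
            squares[j] += c * c
    means = [s / replications for s in sums]
    variances = [(q - replications * m * m) / (replications - 1)
                 for q, m in zip(squares, means)]
    return means, variances


# Hypothetical system: class 0 is the absorbing reservoir of closed accounts.
P = [[1.00, 0.00, 0.00],
     [0.05, 0.80, 0.15],
     [0.02, 0.08, 0.90]]
initial = [0, 120, 80]

means, variances = simulate(initial, P, periods=2, replications=2000)
print([round(m, 2) for m in means])
print([round(v, 2) for v in variances])
```

To reflect estimation error as well, each replication could first draw a perturbed transition matrix from the sampling distribution of the estimates before simulating the accounts.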

The specification of the econometric models used in predicting
the transition probabilities, arrival rate and distribution of new accounts
does not imply that the true relationships between parameters of the model
and exogenous variables are as specified. This study has merely scratched
the surface of the problem of identifying causal relationships between the
parameters of the model and external factors. Further research in this
area is necessary before reliable predictors can be developed for the
parameters.


C. RECOMMENDATIONS

Model I can be turned into an operational tool with little effort. It 
is recommended that the parameters of the model be updated each quarter
to reflect slight changes that may have taken place. If possible, the
entire population should be used to estimate the parameters.

Model II can be made operational only after further research has
been conducted to determine the predictors of the parameters of the model.





APPENDIX A 
DERIVATION OF THE VARIANCES OF NUMBER OF ACCOUNTS
AND AMOUNT OF SAVINGS IN THE POPULATION FOR SINGLE 
STEP TRANSITION 
(1) EXPECTATION, VARIANCE AND COVARIANCE OF RANDOM SUMS 
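Let $N$ be an integer random variable,
$M$ be an integer random variable,
$X_i$ be i.i.d.,
$Y_j$ be i.i.d.,

\[
X = \sum_{i=1}^{N} X_i , \qquad Y = \sum_{j=1}^{M} Y_j .
\]

\begin{align*}
E(XY) &= E\Big( \sum_{i=1}^{N} X_i \sum_{j=1}^{M} Y_j \Big) \\
      &= E\Big( \sum_{i=1}^{N} \sum_{j=1}^{M} X_i Y_j \Big) \\
      &= E(MN)\,E(X_i Y_j)
\end{align*}

\begin{align*}
\mathrm{Cov}(X,Y) &= E(XY) - E(X)E(Y) \\
                  &= E(MN)\,E(X_i Y_j) - E(N)E(X_i)\,E(M)E(Y_j)
\end{align*}

If $X_i$ and $Y_j$ are uncorrelated then

\begin{align*}
\mathrm{Cov}(X,Y) &= E(MN)\,E(X_i)E(Y_j) - E(N)E(M)\,E(X_i)E(Y_j) \\
                  &= E(X_i)E(Y_j)\big( E(MN) - E(M)E(N) \big) \\
                  &= E(X_i)E(Y_j)\,\mathrm{Cov}(M,N)
\end{align*}

\[
\mathrm{Var}(X) = E^{2}(X_i)\,\mathrm{Var}(N) + E(N)\,\mathrm{Var}(X_i)
\]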





Note: $E(X) = E(N)E(X_i)$ can be derived as follows:

\begin{align*}
E(X) &= \sum_{n} E(X \mid N = n)\,P(N = n) \\
     &= \sum_{n} n\,E(X_i)\,P(N = n) \\
     &= E(N)\,E(X_i)
\end{align*}
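
The random-sum results for the mean and variance can be checked numerically on a small case. The sketch here enumerates the exact distribution of a random sum, assuming, as the derivation requires, that the number of terms is independent of the terms themselves. Both distributions are hypothetical.

```python
from itertools import product

# Hypothetical distributions: value -> probability.
n_dist = {0: 0.2, 1: 0.5, 2: 0.3}   # number of terms N
x_dist = {1.0: 0.6, 3.0: 0.4}       # each term X_i


def mean(dist):
    return sum(v * p for v, p in dist.items())


def variance(dist):
    m = mean(dist)
    return sum((v - m) ** 2 * p for v, p in dist.items())


def random_sum_distribution(n_dist, x_dist):
    """Exact distribution of X = X_1 + ... + X_N by enumeration."""
    result = {}
    for n, pn in n_dist.items():
        # Every ordered choice of n term values, with its probability.
        for combo in product(x_dist.items(), repeat=n):
            total = sum(value for value, _ in combo)
            prob = pn
            for _, p in combo:
                prob *= p
            result[total] = result.get(total, 0.0) + prob
    return result


x_sum = random_sum_distribution(n_dist, x_dist)

direct_mean = mean(x_sum)
formula_mean = mean(n_dist) * mean(x_dist)

direct_var = variance(x_sum)
formula_var = (mean(x_dist) ** 2 * variance(n_dist)
               + mean(n_dist) * variance(x_dist))

print(direct_mean, formula_mean)
print(direct_var, formula_var)
```

The enumerated and formula values agree to rounding error, which is a useful check before the same results are applied to the number of accounts and the amount of savings.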
(2) EXPECTATION AND VARIANCE OF NUMBER OF ACCOUNTS
Let $n_i^{a}$ = number of accounts in the ith class at beginning
of time period a.
$p_{ij}$ = transition probability between classes i and j.
eee... i | Spc, 2.2 © 
$x_{ij}$ = number of transitions between classes i and j
during period a.
$n_j^{a+1}$ = number of accounts in the jth class at beginning
of time period a+1.
$N^{a+1}$ = total number of accounts in the system at
beginning of time period a+1.
The assumption that accounts moving out of a class are distributed in
accordance with a multinomial distribution with parameters
$(n_i^{a};\ p_{i1}, p_{i2}, \ldots, p_{im})$ is implicit in the Markov chain
model. If it can be further assumed that accounts moving out of different
classes are independent then the following expressions could be obtained.
For $j \neq k$,

\begin{align*}
\mathrm{Cov}(n_j^{a+1}, n_k^{a+1})
  &= \mathrm{Cov}\Big( \sum_{i=2}^{m} x_{ij} , \sum_{l=2}^{m} x_{lk} \Big) \\
  &= E\Big( \sum_{i=2}^{m} x_{ij} \sum_{l=2}^{m} x_{lk} \Big)
     - E\Big( \sum_{i=2}^{m} x_{ij} \Big) E\Big( \sum_{l=2}^{m} x_{lk} \Big) \\
  &= \sum_{i=2}^{m} \sum_{l=2}^{m} E(x_{ij} x_{lk})
     - \sum_{i=2}^{m} \sum_{l=2}^{m} E(x_{ij}) E(x_{lk}) \\
  &= \sum_{i=2}^{m} \sum_{l=2}^{m} \mathrm{Cov}(x_{ij}, x_{lk})
\end{align*}

By assumption $\mathrm{Cov}(x_{ij}, x_{lk}) = 0$ for $i \neq l$, as accounts
exiting from different classes are independent. Therefore

\[
\mathrm{Cov}(n_j^{a+1}, n_k^{a+1}) = \sum_{i=2}^{m} \mathrm{Cov}(x_{ij}, x_{ik})
\]

As $x_{ij}$ and $x_{ik}$ are multinomial random variables from the same
distribution,

\[
\mathrm{Cov}(x_{ij}, x_{ik}) = - n_i^{a}\, p_{ij}\, p_{ik}
\]

\[
\mathrm{Cov}(n_j^{a+1}, n_k^{a+1}) = - \sum_{i=2}^{m} n_i^{a}\, p_{ij}\, p_{ik} ,
\qquad j \neq k
\]
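
The covariance expression can be evaluated directly, as in the sketch here. The off-diagonal terms follow the result for the covariance between class counts when j and k differ. The diagonal terms use the standard multinomial variance, the sum over i of n_i p_ij (1 - p_ij), which is an assumption added for this illustration. Arrivals are ignored and all figures are hypothetical.

```python
def next_period_moments(n, P):
    """Mean vector and covariance matrix of next-period class counts.

    n[i]    : accounts in open class i at the start of the period.
    P[i][j] : probability that an account in open class i is found in
              destination j at the start of the next period.
    """
    sources = range(len(n))
    m = len(P[0])
    mean = [sum(n[i] * P[i][j] for i in sources) for j in range(m)]
    cov = [[0.0] * m for _ in range(m)]
    for j in range(m):
        for k in range(m):
            if j == k:
                # Multinomial variance, summed over independent sources.
                cov[j][k] = sum(n[i] * P[i][j] * (1.0 - P[i][j])
                                for i in sources)
            else:
                # Covariance between two different destinations.
                cov[j][k] = -sum(n[i] * P[i][j] * P[i][k] for i in sources)
    return mean, cov


# Hypothetical example: two open classes; destinations are ordered as
# (closed, class A, class B).
n_open = [120, 80]
P_open = [[0.05, 0.80, 0.15],
          [0.02, 0.08, 0.90]]

mean_next, cov_next = next_period_moments(n_open, P_open)
print([round(v, 2) for v in mean_next])
for row in cov_next:
    print([round(v, 2) for v in row])
```

Since every existing account ends the period in exactly one destination, the entries of the covariance matrix sum to zero, which gives a quick consistency check.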


(3) EXPECTATION AND VARIANCE OF AMOUNT OF SAVINGS 


Let $Z_{kj}$ = size of the kth account that has entered the jth class
$Z_j^{a+1}$ = amount of savings in class j at the beginning of
period a+1
$Z^{a+1}$ = total amount of savings in the system at the beginning
of period a+1

\[
Z_j^{a+1} = \sum_{i=2}^{m} \sum_{k=1}^{x_{ij}} Z_{kj}
\]
m Xe 
aqr ll Lj 
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